Nonconvex optimization is currently attracting intense interest in
machine learning. This interest is fueled by the rise of deep neural
networks, and also by other complex tasks in related areas.
Although an understanding of why neural networks work so well remains
elusive, there has been impressive progress in algorithms, software, and
systems for nonconvex optimization.
But in today's talk, I want to take a step back from algorithmic
advances (fast stochastic gradient methods, escaping saddle points,
etc.). Instead, I want to draw your attention to a new set of tools that
expand our repertoire of tractable nonconvex problems. In particular, my
focus will be on using
geometry to develop a rich subclass of nonconvex problems that can be
solved to global optimality (or failing that, at least solved
numerically more efficiently).
This subclass is built on the notion of geodesic convexity, a concept
that generalizes the usual vector-space (linear) convexity to nonlinear
spaces. I will outline how geometric thinking leads to improved models
or insights for fundamental tasks in machine learning and statistics,
including large-scale principal components analysis, metric learning,
and Gaussian mixture models. I will cover both theoretical and
practical aspects, discuss wider connections of our results, and
conclude with a broad outlook and open problems.
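
For concreteness, the notion of geodesic convexity mentioned above can
be sketched as follows. This is the standard textbook definition rather
than anything specific to the talk; the symbols M, f, and gamma, and the
choice of positive definite matrices with the affine-invariant metric as
the example, are assumptions made here for illustration.

```latex
% A function f on a Riemannian manifold M is geodesically convex if,
% for every geodesic \gamma : [0,1] \to M joining x = \gamma(0) to
% y = \gamma(1) within a geodesically convex set,
\[
  f(\gamma(t)) \;\le\; (1-t)\, f(x) + t\, f(y),
  \qquad t \in [0,1].
\]
% In Euclidean space, \gamma(t) = (1-t)x + ty, and this reduces to
% ordinary convexity.
%
% Illustrative example (assumed setting): on positive definite matrices
% with the affine-invariant metric, the geodesic from X to Y is
\[
  \gamma(t) \;=\; X^{1/2}\bigl(X^{-1/2}\, Y\, X^{-1/2}\bigr)^{t} X^{1/2},
\]
% and \log\det is geodesically linear along it:
\[
  \log\det \gamma(t) \;=\; (1-t)\,\log\det X + t\,\log\det Y .
\]
```

A function can satisfy the first inequality along geodesics while
failing to be convex along straight lines, which is what makes the
subclass strictly larger than the usual convex one.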