Data often has geometric structure that, if exploited properly, can enable better machine learning outcomes. Machine learning is concerned with probability distributions, which encode uncertainty in, e.g., a prediction, and Optimal Transport gives a principled way to compare such distributions: it uses the geometry of the underlying data to induce geometric structure on the distributions themselves, and it has recently been applied in many machine learning settings. Unfortunately, the applicability of these methods is still limited by computational bottlenecks: it is often simply faster to ignore the geometric structure in the data.

Our goal is to remedy this situation by developing algorithms that are efficient, scalable, and ideally parallelizable across many computers in a communication-efficient way. In particular, some problems for which we are working on fast algorithms include: geometrically aggregating probability distributions from different data sources or computers in a cluster, and clustering and reasoning about data under notions of distance adapted to the data's provenance and properties (we want to treat data lying on a graph differently than data lying on a sphere).
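To make "comparing distributions geometrically" concrete, here is a minimal, self-contained sketch (illustrative only, not code from our projects) of the simplest Optimal Transport distance: the 1-D Wasserstein-1, or earth mover's, distance between two equal-size empirical samples. In one dimension the optimal transport plan just matches the i-th smallest point of one sample to the i-th smallest point of the other.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-weight 1-D samples.

    Assumes equal sample sizes; in that case the optimal plan is the
    monotone matching of sorted points, so the distance is the mean
    absolute difference of the sorted samples.
    """
    assert len(xs) == len(ys), "sketch assumes equal-size samples"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Two samples with disjoint supports look maximally different to a
# bin-wise histogram comparison, but OT also measures *how far* the
# mass must move:
print(wasserstein_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0]))  # → 5.0
```

The one-dimensional case has a closed form; the computational bottlenecks mentioned above arise in higher dimensions, where computing the optimal matching becomes an expensive optimization problem.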
If you would like to contact us about our work, please scroll down to the people section and click on one of the group leads' people pages, where you can reach out to them directly.