Learning with Generalized Negative Dependence: Probabilistic Models of Diversification for Machine Learning

Speaker

CSAIL and LIDS

This thesis establishes negative dependence as a powerful and computationally efficient framework for analyzing machine learning problems that require a theoretical model of diversification. Examples include experimental design and model compression: subset-selection problems that require carefully balancing the quality of each selected element against the diversity of the subset as a whole. Negative dependence, which models the behavior of “repelling” random variables, provides a rich mathematical foundation for the analysis of such problems.
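
As a concrete illustration of this quality-diversity trade-off, the sketch below scores subsets with a determinantal kernel decomposed as L = diag(q) S diag(q), where q holds per-item quality scores and S item similarities. The toy items, scores, and function names are illustrative assumptions, not data or code from the thesis.

```python
# A minimal sketch (NumPy only) of how a determinantal model trades off
# quality against diversity. The decomposition L = diag(q) S diag(q) and
# the toy feature vectors below are illustrative, not from the thesis.
import numpy as np

def dpp_unnormalized_prob(L, subset):
    """Unnormalized DPP probability: P(S) is proportional to det(L_S)."""
    idx = np.asarray(subset)
    return np.linalg.det(L[np.ix_(idx, idx)])

# Three items: items 0 and 1 are near-duplicates, item 2 is distinct.
feats = np.array([[1.0, 0.0],
                  [0.99, 0.14],   # almost parallel to item 0
                  [0.0, 1.0]])
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
quality = np.array([1.0, 1.0, 0.8])          # per-item quality scores
S = feats @ feats.T                           # item similarity matrix
L = np.diag(quality) @ S @ np.diag(quality)   # quality-modulated kernel

# The redundant pair {0, 1} scores far lower than the diverse pair {0, 2},
# even though item 2 has lower individual quality.
print(dpp_unnormalized_prob(L, [0, 1]))  # small: near-duplicate items
print(dpp_unnormalized_prob(L, [0, 2]))  # larger: diverse items
```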

The first part of this thesis develops scalable sampling and learning algorithms for determinantal point processes (DPPs), popular negatively dependent measures with many applications in machine learning. We introduce a theoretically motivated deep generative network that produces DPP-like samples over arbitrary ground sets. To address the learning problem, we show that maximum likelihood estimation for DPPs is drastically sped up with Kronecker kernels, and is further enriched by negative samples.
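
The computational payoff of Kronecker kernels can be seen in the DPP log-likelihood, whose bottleneck is the normalizer log det(L + I). The sketch below is a minimal NumPy illustration under the assumption L = L1 ⊗ L2 (the sizes and variable names are mine): the eigenvalues of a Kronecker product are the pairwise products of the factors' eigenvalues, so the normalizer costs cubic time in n1 and n2 separately rather than in n1·n2.

```python
# A sketch of the key shortcut Kronecker kernels enable for DPP maximum
# likelihood: computing log det(L + I) from the small factors alone.
# Sizes and names are illustrative, not the thesis's implementation.
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T  # symmetric positive semidefinite factor

n1, n2 = 40, 50                   # full ground set has n1 * n2 = 2000 items
L1, L2 = random_psd(n1), random_psd(n2)

# Naive: form and eigendecompose the full kernel -- O((n1 * n2)^3).
L = np.kron(L1, L2)
naive = np.linalg.slogdet(L + np.eye(n1 * n2))[1]

# Kronecker shortcut -- O(n1^3 + n2^3): eigenvalues of kron(L1, L2)
# are all products lam_i * mu_j of the factors' eigenvalues.
lam = np.linalg.eigvalsh(L1)
mu = np.linalg.eigvalsh(L2)
fast = np.sum(np.log1p(np.outer(lam, mu)))  # sum of log(lam_i * mu_j + 1)

assert np.isclose(naive, fast)
print(naive, fast)
```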

The second part of this thesis leverages negative dependence for core problems in machine learning. We begin by deriving a generalized form of volume sampling (GVS) based on elementary symmetric polynomials, and prove that the induced measures exhibit strong negative dependence properties. We then show that classical forms of optimal experimental design can be cast as optimization problems based on GVS, for which we derive randomized and greedy algorithms to obtain the associated designs. Finally, we introduce exponentiated strongly Rayleigh measures, which allow for simple tuning of the strength of repulsive forces between similar items while still enjoying fast sampling algorithms. The great flexibility of exponentiated strongly Rayleigh measures makes them an ideal tool for machine learning problems that benefit from negative dependence theory.
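
To give a sense of how the exponent tunes repulsion, the following sketch runs a generic Metropolis add/delete chain whose stationary distribution is P(S) ∝ det(L_S)^α, the exponentiated version of a DPP: α > 1 strengthens the repulsion between similar items, while α < 1 weakens it. This is a standard illustrative chain, not the thesis's sampling algorithm, and all names in it are mine.

```python
# A minimal Metropolis add/delete sketch targeting P(S) ~ det(L_S)^alpha,
# an exponentiated DPP. Illustrative only; not the thesis's sampler.
import numpy as np

def logdet_subset(L, S):
    if not S:
        return 0.0  # det of the empty matrix is 1 by convention
    idx = sorted(S)
    return np.linalg.slogdet(L[np.ix_(idx, idx)])[1]

def exp_dpp_mcmc(L, alpha, n_steps=5000, rng=None):
    """Metropolis chain over subsets: flip one item's membership per step."""
    if rng is None:
        rng = np.random.default_rng()
    n = L.shape[0]
    S = set()
    cur = 0.0  # alpha * log det(L_S) for the current subset
    for _ in range(n_steps):
        i = rng.integers(n)
        T = S ^ {i}                            # toggle item i in or out
        prop = alpha * logdet_subset(L, T)
        if np.log(rng.random()) < prop - cur:  # Metropolis acceptance rule
            S, cur = T, prop
    return S

# Usage: larger alpha concentrates the measure on more diverse subsets.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))
L = X @ X.T + 0.1 * np.eye(8)   # positive definite kernel on 8 items
print(exp_dpp_mcmc(L, alpha=2.0, rng=rng))
```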