November 05

Anindya De: Junta correlation is testable
4:00–5:00 PM ET, Patil/Kiva G449

Abstract: A Boolean function f on the n-dimensional hypercube is said to be a k-junta if it depends only on some k coordinates of the input. These functions have been widely studied in the context of PCPs, learning theory and property testing. In particular, a flagship result in property testing is that k-juntas can be tested with \tilde{O}(k) queries -- i.e., there is an algorithm which, when given a black box to f, makes only \tilde{O}(k) queries and decides between the following two cases:
1. f is a k-junta.
2. f is at least 0.01-far from every k-junta g in Hamming distance.
Surprisingly, the query complexity is completely independent of the ambient dimension n. In this work, we achieve the same qualitative guarantee for the noise-tolerant version of this problem. In particular, we give a 2^k-query algorithm to distinguish between the following two cases:
1. f is 0.48-close to some k-junta.
2. f is at least 0.49-far from every k-junta.
The algorithm and its proof of correctness are simple and modular, employing classical tools like "random restrictions" together with elementary properties of the noise operator on the cube.

Joint work with Elchanan Mossel and Joe Neeman.
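As background, here is a minimal brute-force check of the k-junta property itself, assuming f is given as a full truth table over {0,1}^n. It is exponential in n and is only meant to make the definition concrete; it is not the tester from the talk, whose query complexity is independent of n.

    from itertools import combinations, product

    def is_k_junta(f, n, k):
        """Brute-force check: does f: {0,1}^n -> {0,1} depend on at most k
        coordinates?  f is a dict mapping input tuples to 0/1.
        Exponential in n; for illustration only."""
        for coords in combinations(range(n), k):
            table = {}
            consistent = True
            for x in product((0, 1), repeat=n):
                key = tuple(x[i] for i in coords)
                if key in table and table[key] != f[x]:
                    consistent = False
                    break
                table[key] = f[x]
            if consistent:
                return True
        return False

    # Example: f(x) = x0 XOR x2 is a 2-junta on 4 variables.
    n = 4
    f = {x: x[0] ^ x[2] for x in product((0, 1), repeat=n)}
    print(is_k_junta(f, n, 2))   # True
    print(is_k_junta(f, n, 1))   # False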

September 24

László Végh: A Strongly Polynomial Algorithm for Linear Exchange Markets
4:00–5:00 PM ET, Kiva/G449

Abstract: In this talk, I will present the first strongly polynomial algorithm for computing an equilibrium in exchange markets with linear utilities. The exchange market model has been extensively studied since its introduction by Walras in 1874. Our starting point is the previous weakly polynomial combinatorial algorithm by Duan and Mehlhorn. We use a variant of this algorithm to gradually reveal new variables that are in the support of every equilibrium. A key subroutine used for re-initializing this algorithm reduces to solving a linear program (LP). Even though we are unable to solve this LP in strongly polynomial time, we show that it can be approximated by a simpler LP with two variables per inequality that is solvable in strongly polynomial time. This turns out to be good enough to make the overall algorithm run in strongly polynomial time.

This is based on joint work with Jugal Garg.
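For readers who want the model pinned down: in a standard formulation of the linear exchange market (stated here for background, not taken verbatim from the talk), agent i brings one unit of good i and has linear utility u_i(x_i) = \sum_j u_{ij} x_{ij}. A price vector p and an allocation x form an equilibrium when the market clears, every agent spends exactly their income, and agents buy only goods of maximum utility per unit of money:

    \[
      \sum_i x_{ij} = 1 \quad \forall j, \qquad
      \sum_j p_j x_{ij} = p_i \quad \forall i, \qquad
      x_{ij} > 0 \;\Longrightarrow\; \frac{u_{ij}}{p_j} = \max_k \frac{u_{ik}}{p_k}.
    \]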

May 21

Sublinear Algorithms for (Delta+1) Vertex Coloring
4:00–5:00 PM ET, 32-G449

Abstract: In this talk, we will examine a classical graph coloring problem through the lens of sublinear algorithms; these are algorithms that use computational resources substantially smaller than the size of the input on which they operate. It is well known that any graph with maximum degree Delta admits a proper vertex coloring with (Delta + 1) colors that can be found via a simple sequential greedy algorithm in linear time and space. But can one find such a coloring via a sublinear algorithm? We will present results that answer this question in the affirmative for several well-studied models of sublinear computation, most notably graph streaming and sublinear-time algorithms. We obtain these results by proving a palette sparsification theorem, which says that if every vertex independently samples O(log n) colors from the available Delta+1 colors, then almost certainly a (Delta + 1) coloring can be found using only the sampled colors. We then show that this palette sparsification theorem naturally leads to essentially optimal sublinear algorithms in each of the above-mentioned models of computation.

Based on joint work with Yu Chen and Sanjeev Khanna.
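To make the palette sparsification statement concrete, here is a small simulation sketch: every vertex samples O(log n) of the Delta+1 colors, and we then try to color the graph using only the sampled lists with a single greedy pass. The theorem guarantees that a proper coloring within the sampled lists exists with high probability; the greedy pass below is only a heuristic stand-in for the actual list-coloring step and is not the algorithm from the talk.

    import math, random

    def palette_sparsify_and_color(adj, seed=0):
        """adj: dict mapping each vertex to a list of its neighbors."""
        random.seed(seed)
        n = len(adj)
        delta = max(len(nbrs) for nbrs in adj.values())
        palette = range(delta + 1)
        samples = min(delta + 1, int(math.ceil(4 * math.log(max(n, 2)))))
        # Each vertex independently samples O(log n) colors.
        lists = {v: random.sample(palette, samples) for v in adj}
        color = {}
        for v in adj:  # heuristic greedy pass restricted to sampled lists
            used = {color[u] for u in adj[v] if u in color}
            free = [c for c in lists[v] if c not in used]
            if not free:
                return None  # greedy stand-in failed (theorem only promises existence)
            color[v] = free[0]
        return color

    # Tiny example: a 5-cycle (Delta = 2, so 3 colors are available).
    adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
    print(palette_sparsify_and_color(adj))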

May 16

New Problems and Perspectives on Learning, Testing, and Sampling in the Small Data Regime
4:00–5:00 PM ET, 32-G463 (Star)

Abstract: I will discuss several new problems related to the general challenge of understanding what conclusions can be made, given a dataset that is relatively small in comparison to the complexity or dimensionality of the underlying distribution from which it is drawn.

In the first setting, we consider the problem of learning a population of Bernoulli (or multinomial) parameters. This is motivated by the "federated learning" setting, where we have data from a large number of heterogeneous individuals who each supply a very modest amount of data, and we ask the extent to which the number of data sources can compensate for the lack of data from each source.

Second, we will discuss the problem of estimating the "learnability" of a dataset: given too little labeled data to train an accurate model, we show that it is often possible to estimate the extent to which a good model exists. Specifically, given labeled data pairs (x, y) drawn from some unknown distribution over such pairs, it is possible to estimate how much of the variance of y can be explained via the best linear function of x, even in the regime where it is impossible to approximate that linear function.

Finally, I will introduce the problem of data "amplification": given n independent draws from a distribution D, to what extent is it possible to output a set of m > n datapoints that are indistinguishable from m i.i.d. draws from D? Curiously, we show that nontrivial amplification is often possible in the regime where n is too small to learn D to any nontrivial accuracy. We also discuss connections between this setting and the challenge of interpreting the behavior of GANs and other ML/AI systems.

This talk will also highlight a number of concrete and more conceptual open directions in all three veins. This work is based on several papers, with Weihao Kong, and with Brian Axelrod, Shivam Garg, and Vatsal Sharan.
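A tiny simulation makes the first setting concrete (an illustration of the small-data regime only, not an estimator from the talk): many sources, each with its own bias p_i, each contributing only t = 5 coin flips. The naive per-source frequencies are far more spread out than the true biases, which is exactly the gap that learning the population of parameters has to close.

    import random

    random.seed(1)
    num_sources, t = 100000, 5
    true_biases = [random.uniform(0.4, 0.6) for _ in range(num_sources)]
    naive = [sum(random.random() < p for _ in range(t)) / t for p in true_biases]

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    print("variance of true biases:    ", round(variance(true_biases), 4))  # ~0.003
    print("variance of naive estimates:", round(variance(naive), 4))        # ~0.05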

April 23

Yoram Singer: Memory-Efficient Adaptive Optimization for Humungous-Scale Learning
4:00–5:00 PM ET, Patil/Kiva G449 (Gates Bldg, Stata)

Abstract: Adaptive gradient-based optimizers such as AdaGrad and Adam are among the methods of choice in modern machine learning. These methods maintain second-order statistics of each model parameter, thus doubling the memory footprint of the optimizer. In behemoth-size applications, this memory overhead restricts the size of the model being used as well as the number of examples in a mini-batch. We describe a novel, simple, and flexible adaptive optimization method with sublinear memory cost that retains the benefits of per-parameter adaptivity while allowing for larger models and mini-batches. We give convergence guarantees for our method and demonstrate its effectiveness in training some of the largest deep models used at Google.

Bio: Yoram Singer is the head of the Principles Of Effective Machine-learning (POEM) research group in Google Brain and a professor of Computer Science at Princeton. He was a member of the technical staff at AT&T Research from 1995 through 1999 and an associate professor at the Hebrew University from 1999 through 2007. He is a fellow of AAAI. His research on machine learning algorithms has received several awards.
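The memory issue, and one way around it, can be sketched in a few lines. The code below contrasts AdaGrad's full second-moment accumulator for a matrix-shaped parameter with a row/column-factored upper bound on it, which stores only m + n numbers instead of m * n. This is a sketch of the general idea of factored second-moment statistics and is not claimed to be the exact method presented in the talk.

    import numpy as np

    def adagrad_update(param, grad, accum, lr=0.1, eps=1e-8):
        accum = accum + grad ** 2                 # full m x n accumulator
        return param - lr * grad / (np.sqrt(accum) + eps), accum

    def factored_update(param, grad, row_acc, col_acc, lr=0.1, eps=1e-8):
        row_acc = row_acc + np.max(grad ** 2, axis=1)   # m numbers
        col_acc = col_acc + np.max(grad ** 2, axis=0)   # n numbers
        # Entrywise upper bound on the full AdaGrad accumulator.
        approx = np.minimum(row_acc[:, None], col_acc[None, :])
        return param - lr * grad / (np.sqrt(approx) + eps), row_acc, col_acc

    m, n = 4, 3
    rng = np.random.default_rng(0)
    param, grad = np.zeros((m, n)), rng.normal(size=(m, n))
    _, accum = adagrad_update(param, grad, np.zeros((m, n)))
    _, r, c = factored_update(param, grad, np.zeros(m), np.zeros(n))
    print("full accumulator entries:", accum.size, " factored:", r.size + c.size)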

April 09

Sam Hopkins: How to Estimate the Mean of a Heavy-Tailed Vector in Polynomial Time
4:00–5:00 PM ET, Patil/Kiva G449

Abstract: We study polynomial-time algorithms for estimating the mean of a multivariate random vector under very mild assumptions: we assume only that the random vector X has finite mean and covariance. This allows X to be heavy-tailed. In this setting, the radius of the confidence intervals achieved by the empirical mean is exponentially larger than in the case that X is Gaussian or sub-Gaussian. That is, the empirical mean is poorly concentrated.

We offer the first polynomial-time algorithm to estimate the mean of X with sub-Gaussian-size confidence intervals under such mild assumptions. That is, our estimators are exponentially better concentrated than the empirical mean. Our algorithm is based on a new semidefinite programming relaxation of a high-dimensional median. Previous estimators which assumed only the existence of finitely many moments of X either sacrifice sub-Gaussian performance or are only known to be computable via brute-force search procedures requiring time exponential in the dimension.

Based on https://arxiv.org/abs/1809.07425, to appear in the Annals of Statistics.
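As a point of reference, the classical baseline for heavy-tailed mean estimation is median-of-means; the coordinate-wise version below is easy to compute but does not attain the dimension-free sub-Gaussian confidence sets of the SDP-based estimator described in the talk. It is shown only to make the problem setup concrete.

    import numpy as np

    def median_of_means(samples, num_buckets):
        """samples: (n, d) array.  Split into buckets, average each bucket,
        and return the coordinate-wise median of the bucket means."""
        n = samples.shape[0]
        usable = n - n % num_buckets
        buckets = np.array_split(samples[:usable], num_buckets)
        bucket_means = np.stack([b.mean(axis=0) for b in buckets])
        return np.median(bucket_means, axis=0)

    rng = np.random.default_rng(0)
    n, d = 2000, 5
    heavy = rng.standard_t(df=2.5, size=(n, d))   # heavy-tailed, true mean 0
    print("empirical mean :", np.round(heavy.mean(axis=0), 3))
    print("median of means:", np.round(median_of_means(heavy, 40), 3))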

March 12

Arkadev Chattopadhyay: The Log-Approximate-Rank Conjecture is False
4:00–5:00 PM ET, Patil/Kiva G449

Abstract: We construct a simple and total XOR function F on 2n variables that has only O(sqrt(n)) spectral norm, O(n^2) approximate rank and O(n^{2.5}) approximate nonnegative rank. We show it has polynomially large randomized bounded-error communication complexity of Omega(sqrt(n)). This yields the first exponential gap between the logarithm of the approximate rank and randomized communication complexity for total functions. Thus, F witnesses a refutation of the Log-Approximate-Rank Conjecture, which was posed by Lee and Shraibman (2007) as a very natural analogue for randomized communication of the still unresolved Log-Rank Conjecture for deterministic communication. The best previously known gap between the two measures for any total function was a recent 4th-power separation by Göös, Jayram, Pitassi and Watson (2017).

Additionally, our function F refutes Grolmusz's Conjecture (1997) and a variant of the Log-Approximate-Nonnegative-Rank Conjecture, suggested by Kol, Moran, Shpilka and Yehudayoff (2014), both of which are implied by the LARC. The complement of F has exponentially large approximate nonnegative rank. This answers a question of Lee (2012) and Kol et al. (2014), showing that approximate nonnegative rank can be exponentially larger than approximate rank. The function F also falsifies a conjecture about parity measures of Boolean functions made by Tsang, Wong, Xie and Zhang (2013). The latter conjecture implied the Log-Rank Conjecture for XOR functions.

Remarkably, after our manuscript was published in the public domain, two groups of researchers, Anshu-Boddu-Touchette (2018) and Sinha-de Wolf (2018), showed independently that the function F even refutes the Quantum-Log-Approximate-Rank Conjecture.

(This is joint work with Nikhil Mande and Suhail Sherif.)
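For reference, here is a paraphrase (not text from the talk) of the conjecture and of the gap witnessed by F. Writing \widetilde{\mathrm{rank}}_{1/3}(M_F) for the approximate rank of the communication matrix of F and R_{1/3}(F) for its randomized bounded-error communication complexity, the Log-Approximate-Rank Conjecture asserts

    \[
      R_{1/3}(F) \;\le\; \mathrm{polylog}\bigl(\widetilde{\mathrm{rank}}_{1/3}(M_F)\bigr)
      \quad \text{for every total } F,
    \]

whereas the function F above has \log \widetilde{\mathrm{rank}}_{1/3}(M_F) = O(\log n) but R_{1/3}(F) = \Omega(\sqrt{n}), an exponential gap.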