Add to Calendar
2023-12-07 16:00:00
2023-12-07 17:00:00
America/New_York
New Circuit Lower Bounds via Solving Range Avoidance
Abstract: Almost all n-bit Boolean functions have near-maximum (2^n/n) circuit complexity, and a central challenge of complexity theory is to pinpoint one such hard function. Formally, the goal is to identify the smallest complexity class containing a language with exponential circuit complexity. Despite its importance, there had been essentially no progress on this question since the four-decade-old work of Kannan. In particular, it remained open whether Sigma_2 E, an important frontier class in complexity theory, contains a language with exponential circuit complexity.

In a recent paper with Hanlin Ren and Shuichi Hirahara ([CHR'23]), we resolved this long-standing open question by showing that Sigma_2 E requires near-maximum (2^n/n) circuit complexity. The lower bound also extends to S_2E/1 (symmetric exponential time with one bit of advice). Our lower bound was further simplified and improved in exciting recent work by Zeyong Li [Li'23], who showed that S_2E (and therefore Sigma_2 E) requires near-maximum circuit complexity on all input lengths.

In this talk, I will explain the new lower bounds for Sigma_2 E, which are corollaries of new algorithms for range avoidance. I will first describe an algorithm for range avoidance (from [CHR'23]) that works on infinitely many input lengths and follows the iterative win-win paradigm. I will then explain how [Li'23] significantly simplifies and improves the [CHR'23] algorithm so that it works on all input lengths. This improvement also allows [Li'23] to answer a recent open question posed by Vyas and Williams, giving quasi-polynomial-size depth-3 AC^0 circuits for the Missing String problem.
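For concreteness, the range avoidance problem behind these results asks: given a circuit mapping n bits to m > n bits, find a string outside its range (one always exists by counting, since 2^m > 2^n). A toy brute-force sketch, with a made-up example "circuit"; the talk's algorithms are precisely about avoiding this exponential enumeration:

```python
from itertools import product

def avoid(C, n, m):
    """Brute-force range avoidance: given a map C from n-bit strings to
    m-bit strings with m > n, return an m-bit string outside C's range.
    Takes exponential time; shown only to pin down the problem."""
    assert m > n
    image = {C(x) for x in product((0, 1), repeat=n)}
    for y in product((0, 1), repeat=m):
        if y not in image:
            return y

# Toy stretching "circuit": duplicate the last bit (n -> n + 1).
C = lambda x: x + (x[-1],)
y = avoid(C, n=3, m=4)  # a 4-bit string whose last two bits differ
```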
32-G575
December 07
December 06
Add to Calendar
2023-12-06 16:00:00
2023-12-06 17:00:00
America/New_York
Algorithms for Generalized Clustering: Min-Max Objective to Cascaded Norms
Abstract: In this talk, I will consider the k-clustering problem with the min-max objective: given an input set of points P partitioned into L predefined groups, the aim is to find a k-clustering that minimizes the maximum average clustering cost (e.g., k-means cost) across these groups, reflecting fairness in applications like polling station or vaccination site placement. I present an O(log L / log log L)-approximation for the clustering problem with the min-max objective, which exponentially improves over the previously known O(L)-approximation and is optimal up to a constant factor.

Moreover, I will discuss the generalization of the problem to (p,q)-cascaded norms and arbitrary weight functions over the points. As an application, this allows a “relaxed” way of enforcing the fairness requirement, as its objective interpolates between the objective of classic k-clustering and that of socially fair clustering. The problem is also closely related to other well-known problems in combinatorial optimization, such as densest k-subgraph and min-k union.

Bio: Ali Vakilian is a Research Assistant Professor at TTIC. His research interests include algorithms for massive data, learning-augmented algorithms, and algorithmic fairness. Ali received his Ph.D. from MIT, where he was advised by Erik Demaine and Piotr Indyk. He completed his MS studies at UIUC, where he was a recipient of the Siebel Scholar award.
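As a concrete reference point, evaluating the min-max objective for a fixed set of centers looks as follows (a hedged sketch; the function names and the toy 1-D data are mine, not from the talk):

```python
def minmax_cost(groups, centers):
    """Min-max k-means objective: the maximum, over groups, of that
    group's average squared distance to its nearest center."""
    def sq_dist(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    def avg_cost(points):
        return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)
    return max(avg_cost(g) for g in groups)

# Two 1-D groups and a single center at the origin: group one pays
# (0 + 4) / 2 = 2.0 on average, group two pays 1.0, so the objective is 2.0.
groups = [[(0.0,), (2.0,)], [(1.0,)]]
print(minmax_cost(groups, centers=[(0.0,)]))  # → 2.0
```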
32-G575
November 27
Add to Calendar
2023-11-27 16:00:00
2023-11-27 17:00:00
America/New_York
Optimal PAC Bounds without Uniform Convergence
Abstract: In statistical learning theory, determining the optimal sample complexity of binary classification was a longstanding challenge, resolved only recently by Hanneke. Unfortunately, these arguments rely on the so-called uniform convergence principle, which limits their applicability to more general learning settings. In this talk, we will discuss a new technique that overcomes this limitation and allows us to derive optimal sample complexity bounds for settings where uniform convergence is provably suboptimal and, in some cases, does not yield any quantitative finite-sample guarantees, e.g., multiclass classification, partial concept classification, and bounded regression. Our bounds resolve a few open questions in the literature, and our learning algorithm is based on a novel analysis of the online-to-batch framework of Littlestone and the celebrated one-inclusion graph algorithm of Haussler, Littlestone, and Warmuth.

Based on joint work with Ishaq Aden-Ali, Abhishek Shetty, and Nikita Zhivotovskiy.
32-D507
November 15
Add to Calendar
2023-11-15 16:00:00
2023-11-15 17:00:00
America/New_York
Direct Product Testing on High Dimensional Expanders
Abstract: The problem of testing direct product functions lies at the intersection of many areas within theoretical computer science, such as error-correcting codes, probabilistically checkable proofs (PCPs), hardness amplification, and property testing. We want to efficiently encode a function f from [n] to {0,1} using local views in a way that admits local testability; the direct product encoding gives us the restriction of f to various subsets of [n]. A set system is said to support a direct product test when the following property holds: whenever a natural 2-query test passes with noticeable probability, the encoding is correlated with a direct-product encoding. We study the question of which hypergraph expansion properties are required of set systems that support a direct product test in the list-decoding regime. In contrast to the unique-decoding regime, we show that spectral expansion is insufficient: the set system must additionally possess a form of topological expansion called coboundary expansion. This also turns out to be sufficient, giving us a characterization of such set systems. Building set systems with these expansion properties would thus lead to the ultimate form of efficient direct product testers and perhaps also allow for efficient hardness amplification.

Based on joint work with Dor Minzer (https://eccc.weizmann.ac.il/report/2023/120/).
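The natural 2-query test can be sketched concretely. The version below exhaustively checks every pair of sets, whereas the actual test queries a single random pair; the toy function and set system are illustrative:

```python
def dp_encode(f, sets):
    """Direct-product encoding: the restriction of f to each set in the system."""
    return {S: tuple(f[i] for i in S) for S in sets}

def two_query_test(table, sets):
    """Exhaustive variant of the natural 2-query consistency test: for every
    pair of sets, check that their two local views agree on the intersection.
    (The real test samples one random pair and accepts/rejects on it.)"""
    for S in sets:
        for T in sets:
            for i in set(S) & set(T):
                if table[S][S.index(i)] != table[T][T.index(i)]:
                    return False
    return True

f = [0, 1, 1, 0]                 # a function on [4] = {0, 1, 2, 3}
sets = [(0, 1), (1, 2), (2, 3)]  # a toy set system
print(two_query_test(dp_encode(f, sets), sets))  # → True
```

A genuine direct-product encoding always passes; the content of the talk is what the test's acceptance implies in the converse (list-decoding) direction.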
32-G575
November 01
Add to Calendar
2023-11-01 16:00:00
2023-11-01 17:00:00
America/New_York
Clip-OGD: An Experimental Design for Adaptive Neyman Allocation in Sequential Experiments
Abstract: From clinical trials and public health to development economics and political science, randomized experiments stand out as one of the most reliable methodological tools, as they require the fewest assumptions to estimate causal effects. Adaptive experiment designs – where experimental subjects arrive sequentially and the probability of treatment assignment can depend on previously observed outcomes – are becoming an increasingly popular method for causal inference, as they offer the possibility of improved precision over their non-adaptive counterparts. However, even in simple settings (e.g., two treatments), the extent to which adaptive designs can improve precision is not sufficiently well understood.

In this talk, I present my recent work on the problem of Adaptive Neyman Allocation, where the experimenter seeks to construct an adaptive design that is nearly as efficient as the optimal (but infeasible) non-adaptive Neyman design, which has access to all potential outcomes. I will show that the experimental design problem is equivalent to an adversarial online convex optimization problem, suggesting that any solution must exhibit some amount of algorithmic sophistication. Next, I present Clip-OGD, an experimental design that combines the online gradient descent principle with a new time-varying probability-clipping technique. I will show that the Neyman variance is attained in large samples by showing that the expected regret of the online optimization problem is bounded by O(\sqrt{T}), up to sub-polynomial factors. Even though the design is adaptive, we construct a consistent (conservative) estimator for the variance, which facilitates the development of valid confidence intervals. Finally, we demonstrate the method on data collected from a micro-economic experiment.
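The probability-clipping idea can be caricatured in a few lines. Everything below (the step size, the gradient values, and the exact clipping schedule) is an illustrative assumption, not the design from the paper; the point is only the shape of the update: a gradient step on the treatment probability followed by clipping to a slowly widening interval.

```python
import math

def clip_ogd_probs(grads, eta=0.1):
    """Caricature of the Clip-OGD idea: run online gradient descent on the
    treatment probability, then clip to a time-varying interval
    [delta_t, 1 - delta_t], so neither arm's assignment probability
    collapses too quickly. Schedule and gradients are illustrative."""
    p, probs = 0.5, []
    for t, g in enumerate(grads, start=1):
        delta = 0.5 / math.sqrt(t + 1)       # illustrative clipping schedule
        p = max(delta, min(1 - delta, p - eta * g))
        probs.append(p)
    return probs

probs = clip_ogd_probs([0.3, -0.5, 0.2, 0.1])
```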
32-G575
October 25
Fast (distributional) swap regret and applications to approximate correlated equilibria
Columbia University
Add to Calendar
2023-10-25 16:00:00
2023-10-25 17:00:00
America/New_York
Fast (distributional) swap regret and applications to approximate correlated equilibria
Abstract: We give a simple and computationally efficient algorithm that, for any constant ε>0, obtains εT distributional swap regret within only T = polylog(n) rounds; this is an exponential improvement over the super-linear number of rounds required by the state-of-the-art algorithm, and it resolves the main open problem of [Blum-Mansour JMLR'07]. Our algorithm has an exponential dependence on ε, but we prove a new, matching lower bound.

Our algorithm for swap regret implies faster convergence to ε-Correlated Equilibrium (ε-CE) in several regimes. For normal-form two-player games with n actions, it implies: the first uncoupled dynamics that converge to the set of ε-CE in polylogarithmic rounds; a polylog(n)-bit communication protocol for ε-CE in two-player games (resolving an open problem mentioned by [Babichenko-Rubinstein STOC'17, Ganor-CS APPROX'18, Goos-Rubinstein FOCS'18]); and an $\tilde{O}(n)$-query algorithm for ε-CE (resolving an open problem of [Babichenko 2020] and obtaining the first separation between ε-CE and ε-Nash equilibrium in the query complexity model). For extensive-form games, our algorithm implies a PTAS for normal-form correlated equilibria, a solution concept widely conjectured to be computationally intractable (e.g., [Stengel-Forges MOR'08, Fujii'23]).

Joint work with Aviad Rubinstein.
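For concreteness, swap regret compares the realized loss to the best post-hoc "swap function" mapping each played action to a replacement; because the best swap decomposes per action, it is easy to compute in hindsight. A small sketch (the toy loss sequence is mine):

```python
def swap_regret(actions, losses):
    """Swap regret of a play sequence: realized loss minus the loss of the
    best action-to-action swap function applied in hindsight.
    losses[t][a] is the loss of action a in round t."""
    n = len(losses[0])
    realized = sum(losses[t][a] for t, a in enumerate(actions))
    # The best swap decomposes per played action: for each action a,
    # independently pick the best replacement b over the rounds a was played.
    swapped = 0.0
    for a in range(n):
        rounds = [t for t, at in enumerate(actions) if at == a]
        if rounds:
            swapped += min(sum(losses[t][b] for t in rounds) for b in range(n))
    return realized - swapped

# Two actions; the player stubbornly plays action 0 and pays 1 + 1 + 0 = 2,
# while swapping 0 -> 1 would have paid 0 + 0 + 1 = 1.
losses = [[1, 0], [1, 0], [0, 1]]
print(swap_regret([0, 0, 0], losses))  # → 1
```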
32-G575
October 18
On Provable Copyright Protection for Generative Models
Harvard University
Add to Calendar
2023-10-18 16:00:00
2023-10-18 17:00:00
America/New_York
On Provable Copyright Protection for Generative Models
Abstract: There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data C that was in their training set. We give a formal definition of near access-freeness (NAF) and prove bounds on the probability that a model satisfying this definition outputs a sample similar to C, even if C is included in its training set. Roughly speaking, a generative model p is k-NAF if, for every potentially copyrighted data C, the output of p diverges by at most k bits from the output of a model q that did not access C at all. We also give generative model learning algorithms, which efficiently modify the original learning algorithm in a black-box manner, that output generative models with strong bounds on the probability of sampling protected content. Furthermore, we provide promising experiments showing minimal degradation in output quality while ensuring strong protection against sampling protected content.

Joint work with Sham Kakade and Boaz Barak (https://arxiv.org/abs/2302.10870).
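For discrete output distributions, the "diverges by at most k bits" condition is a pointwise max-divergence bound and can be checked directly. A minimal sketch, treating both models as explicit probability tables (an assumption for illustration; real generative models expose only samples or likelihoods):

```python
import math

def is_k_naf(p, q, k):
    """Check the k-NAF condition pointwise for discrete distributions:
    log2(p(y) / q(y)) <= k for every output y with p(y) > 0, where q is a
    model trained without access to the copyrighted data C. In particular,
    p then assigns any fixed output at most 2^k times its probability
    under q, which is what bounds the chance of emitting C."""
    for y, py in p.items():
        if py == 0:
            continue
        qy = q.get(y, 0.0)
        if qy == 0 or math.log2(py / qy) > k:
            return False
    return True

p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
print(is_k_naf(p, q, k=1))  # → True  (log2(0.5 / 0.25) = 1 <= 1)
```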
32-G575
October 11
Add to Calendar
2023-10-11 16:00:00
2023-10-11 17:00:00
America/New_York
High-dimensional expansion in random geometric graphs
Abstract: High-dimensional expansion is a generalization of expansion to hypergraphs, where the expansion of certain random walks is witnessed by local neighborhoods.

This talk is on the topic of constructing natural probability distributions over sparse high-dimensional expanders (HDXes). On one hand, (standard, 1-dimensional) sparse expanders are plentiful, since a constant-degree random graph is an expander with high probability. On the other hand, most natural random models of hypergraphs fail to exhibit 2-dimensional expansion unless the average degree exceeds sqrt(number of vertices). However, sparse HDXes are known to exist thanks to algebraic and number-theoretic constructions. We describe some progress towards constructing sparse HDXes based on random geometric graphs on the unit sphere.

Based on joint work with Siqi Liu, Tselil Schramm, and Elizabeth Yang. https://arxiv.org/abs/2210.00158
32-G575
October 04
Add to Calendar
2023-10-04 16:00:00
2023-10-04 17:00:00
America/New_York
Finite-Sample Symmetric Mean Estimation with Fisher Information Rate
Abstract: We consider the problem of estimating the mean of a $1$-dimensional distribution $f$ given $n$ i.i.d. samples. When $f$ is known up to shift, the classical maximum-likelihood estimate (MLE) is known to be optimal in the limit as $n \to \infty$: it is asymptotically normal with variance matching the Cramer-Rao lower bound of $\frac{1}{n \mathcal I}$, where $\mathcal I$ is the Fisher information of $f$. Furthermore, [Stone; 1975] showed that the same convergence can be achieved even when $f$ is \emph{unknown} but \emph{symmetric}. However, these results do not hold for finite $n$, or when $f$ varies with $n$ and the failure probability $\delta$.

In this talk, I will present two recent works that together develop a finite-sample theory for symmetric mean estimation in terms of Fisher information. We show that for arbitrary (symmetric) $f$ and $n$, one can recover finite-sample guarantees based on the \emph{smoothed} Fisher information of $f$, where the smoothing radius decays with $n$.

Based on joint works with Jasper C.H. Lee, Eric Price, and Paul Valiant.
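The asymptotic benchmark that the finite-sample theory refines can be written out explicitly. For a location family with density $f$, standard asymptotics give

```latex
\sqrt{n}\,\bigl(\hat\mu_{\mathrm{MLE}} - \mu\bigr)
  \;\xrightarrow{\,d\,}\; \mathcal N\!\Bigl(0,\ \tfrac{1}{\mathcal I}\Bigr),
\qquad
\operatorname{Var}(\hat\mu) \;\ge\; \frac{1}{n\,\mathcal I},
\qquad
\mathcal I \;=\; \int \frac{f'(x)^2}{f(x)}\,dx,
```

where the inequality is the Cramer-Rao bound. The talk's results replace $\mathcal I$ with a smoothed Fisher information so that guarantees of this flavor hold at finite $n$ and failure probability $\delta$.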
32-D507
September 25
Add to Calendar
2023-09-25 16:00:00
2023-09-25 17:00:00
America/New_York
User-Level Differential Privacy With Few Examples Per User
Abstract: Previous work on user-level differential privacy (DP) [Ghazi et al., NeurIPS 2021; Bun et al., STOC 2023] obtained generic algorithms that work for various learning tasks. However, their focus was on the example-rich regime, where the users have so many examples that each user could themselves solve the problem. In this work we consider the example-scarce regime, where each user has only a few examples, and obtain the following results:

For approximate-DP, we give a generic transformation of any item-level DP algorithm to a user-level DP algorithm. Roughly speaking, the latter gives a (multiplicative) savings of O_{ε,δ}(√m) in terms of the number of users required for achieving the same utility, where m is the number of examples per user. This algorithm, while recovering most known bounds for specific problems, also gives new bounds, e.g., for PAC learning.

For pure-DP, we present a simple technique for adapting the exponential mechanism [McSherry & Talwar, FOCS 2007] to the user-level setting. This gives new bounds for a variety of tasks, such as private PAC learning, hypothesis selection, and distribution learning. For some of these problems, we show that our bounds are near-optimal.
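The item-level primitive being adapted here is the exponential mechanism, which is easy to sketch: sample a candidate with probability proportional to an exponential in its (data-dependent) score. The candidate set, scores, and parameters below are hypothetical; the talk's contribution is adapting the scoring to user-level granularity, which this sketch does not do.

```python
import math
import random

def exponential_mechanism(candidates, score, eps, sensitivity=1.0, rng=random):
    """McSherry-Talwar exponential mechanism: sample candidate h with
    probability proportional to exp(eps * score(h) / (2 * sensitivity)).
    Satisfies eps-DP when score has the stated sensitivity."""
    weights = [math.exp(eps * score(h) / (2 * sensitivity)) for h in candidates]
    r = rng.random() * sum(weights)
    for h, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return h
    return candidates[-1]

rng = random.Random(0)
scores = {"h1": 10, "h2": 0}  # hypothetical utility scores
picks = [exponential_mechanism(["h1", "h2"], scores.get, eps=2.0, rng=rng)
         for _ in range(200)]
# The high-scoring candidate dominates the draws.
```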
32-G449
September 06
Add to Calendar
2023-09-06 16:00:00
2023-09-06 17:00:00
America/New_York
A unified framework for robustness, fairness, and collaboration in machine learning
Abstract: Pervasive needs for robustness, multi-agent collaboration, and fairness have motivated the design of new methods in research and development. However, these methods remain largely stylized, lacking a foundational perspective and provable performance. In this talk, I will introduce and highlight the importance of multi-objective learning as a unifying paradigm for addressing these needs. This paradigm aims to optimize complex and unstructured objectives from only a small amount of sampled data. I will also discuss how the multi-objective learning paradigm relates to classical and modern considerations in machine learning broadly, introduce technical tools with versatile provable guarantees, and present empirical evidence for its performance on a range of important benchmarks.

Bio: Nika Haghtalab is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley. She works broadly on the theoretical aspects of machine learning and algorithmic economics. Prof. Haghtalab's work builds theoretical foundations for ensuring both the performance of learning algorithms in the presence of everyday economic forces and the integrity of social and economic forces that are born out of the use of machine learning systems. She received her Ph.D. from the Computer Science Department of Carnegie Mellon University, where her thesis won the CMU School of Computer Science Dissertation Award (ACM nomination) and the SIGecom Dissertation Honorable Mention. She is a co-founder of Learning Theory Alliance (LeT-All), a large-scale mentoring and community-building initiative for the machine learning theory community. Among her honors are an NSF CAREER award, NeurIPS and ICAPS best paper awards, an EC exemplary in AI track award, and several industry awards and fellowships.
32-G575
May 30
Add to Calendar
2023-05-30 16:00:00
2023-05-30 17:00:00
America/New_York
Algorithmic Applications of Hypergraph and Partition Containers
Abstract: We present a general method to convert algorithms into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This family of inputs vastly generalizes several families of inputs for which we commonly have improved algorithms, including bounded-degree inputs and random inputs. It also generalizes families of inputs for which we do not usually have faster algorithms, including regular inputs of arbitrarily high degree and all very dense inputs. We apply our method to achieve breakthroughs in exact algorithms for several central NP-complete problems, including k-SAT, Graph Coloring, and Maximum Independent Set.

Our main tool is the first algorithmic application of the relatively new Hypergraph Container Method (Saxton and Thomason 2015; Balogh, Morris and Samotij 2015). This recent breakthrough, which generalizes an earlier version for graphs (Kleitman and Winston 1982; Sapozhenko 2001), has been used extensively in recent years in extremal combinatorics. An important component of our work is a new generalization of (hyper-)graph containers to Partition Containers.
32-G575
April 27
Query-optimal estimation of unitary channels in diamond distance
University of Washington
Add to Calendar
2023-04-27 16:00:00
2023-04-27 17:00:00
America/New_York
Query-optimal estimation of unitary channels in diamond distance
Abstract: We will present an algorithm to learn an unknown unitary channel acting on a d-dimensional qudit to diamond-norm error ε, using O(d²/ε) applications of the unknown channel and only one qudit. This algorithm uses the optimal number of qudits and number of queries, even if one has access to the inverse or controlled versions of the unknown unitary. This improves over prior work, which achieves entanglement infidelity δ using O(d²/√δ) applications in parallel, thereby requiring Ω(d²) qudits. Based on joint work with Jeongwan Haah, Robin Kothari, and Ryan O'Donnell.
32-G449
April 26
Add to Calendar
2023-04-26 16:00:00
2023-04-26 17:00:00
America/New_York
Statistical inference with computational constraints
Abstract: The vast amount of digital data we create and collect has revolutionized many scientific fields and industrial sectors. However, despite our success in harnessing this transformative power of data, computational and societal trends emerging from the current practices of data science necessitate upgrading our toolkit for data analysis.

In this talk, we discuss how practical considerations such as time and memory limits affect statistical inference tasks, with a particular focus on the problem of estimating the entropy of a distribution by streaming over i.i.d. samples from it. We determine how bounded memory affects the number of samples we need to solve this problem.

Recent work by Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution D over an alphabet of size k up to ε additive error by streaming over (k/ε^3)·polylog(1/ε) i.i.d. samples and using only O(1) words of memory. In this work, we propose a new constant-memory scheme that reduces the sample complexity to (k/ε^2)·polylog(1/ε). We conjecture that this is optimal up to polylog(1/ε) factors.

Based on: https://arxiv.org/abs/2205.09804

Joint work with: Andrew McGregor, Jelani Nelson, Erik Waingarten
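For contrast with the constant-memory schemes discussed, the naive plug-in estimator is easy to state: it streams over the samples but keeps a full histogram, i.e. on the order of k words of memory rather than O(1). A minimal sketch:

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in (empirical) entropy estimate in bits. One pass over
    the samples, but the histogram costs memory linear in the alphabet
    size -- the cost that the constant-memory schemes avoid."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two symbols, each with empirical frequency 1/2: entropy is exactly 1 bit.
print(plugin_entropy(["a", "b", "a", "b"]))  # → 1.0
```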
32-G575
April 19
Add to Calendar
2023-04-19 16:00:00
2023-04-19 17:00:00
America/New_York
Credible Decentralized Exchange Design via Verifiable Sequencing Rules
Abstract: Trading on decentralized exchanges has been one of the primary use cases for permissionless blockchains, with daily trading volume exceeding billions of U.S. dollars. In the status quo, users broadcast transactions they wish to execute in the exchange, and miners are responsible for composing a block of transactions and picking an execution ordering, i.e., the order in which transactions execute in the exchange. Due to the lack of a regulatory framework, it is common to observe miners exploiting their privileged position by front-running transactions and obtaining risk-free profits. Indeed, the Flashbots service institutionalizes this exploit, with miners auctioning the right to front-run transactions.

In this work, we propose to modify the interaction between miners and users and initiate the study of verifiable sequencing rules. As in the status quo, miners can determine the content of a block; however, they commit to respecting a sequencing rule that constrains the execution ordering and is verifiable (there is a polynomial-time algorithm that can verify whether the execution ordering satisfies such constraints). Thus, in the event a miner deviates from the sequencing rule, anyone can generate a proof of non-compliance.

We ask if there are sequencing rules that limit price manipulation by miners in a two-token liquidity pool exchange. Our first result is an impossibility theorem: for any sequencing rule, there is an instance of user transactions where the miner can obtain non-zero risk-free profits. In light of this impossibility result, our main result is a verifiable sequencing rule that provides execution price guarantees for users. In particular, for any user transaction A, it ensures that either (1) the execution price of A is at least as good as if A were the only transaction in the block, or (2) the execution price of A is worse than this "standalone" price and the miner does not gain when including A in the block.

Our framework does not require users to use countermeasures against predatory trading strategies, for example, setting limit prices or splitting large transactions into smaller ones. This is likely to improve user experience relative to the status quo.

Joint work with David C. Parkes. To appear at STOC 2023.
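The "standalone" benchmark in guarantee (1) is concrete in the common constant-product pool model (an illustrative assumption here, not a claim about the paper's exact exchange model): the price a trade would get if it executed against the pool's initial reserves, before any other transaction in the block moves them.

```python
def swap_out(x_reserve, y_reserve, dx):
    """Constant-product pool (x * y stays constant): amount of token Y
    received for depositing dx of token X."""
    k = x_reserve * y_reserve
    return y_reserve - k / (x_reserve + dx)

def standalone_price(x0, y0, dx):
    """Execution price (Y per X) a buy order would get if it were the
    only transaction in the block, i.e. against the initial reserves."""
    return swap_out(x0, y0, dx) / dx

# A miner-chosen ordering can move the pool before a user's trade executes;
# the sequencing-rule guarantee compares the realized price to this benchmark.
x0, y0, dx = 100.0, 100.0, 10.0
bench = standalone_price(x0, y0, dx)  # 10/11 of a Y token per X token
```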
32-G575
April 12
Bounded independence plus noise and derandomization
Harvard University
Add to Calendar
2023-04-12 16:00:00
2023-04-12 17:00:00
America/New_York
Bounded independence plus noise and derandomization
Abstract: k-wise independent distributions have been a fundamental object in algorithm design and derandomization since the late 70s. They are defined as distributions on n bits that are uniform on every k coordinates. Despite their importance, they have limitations as a pseudorandom object. A recent line of work has shown that adding noise to k-wise independent distributions can be unexpectedly powerful, leading to new constructions of pseudorandom generators (PRGs) for various classes of functions. In this talk, I will present an overview of the latest research on the topic, including various constructions and analyses of PRGs, and how the topic relates to coding theory and Boolean function analysis.
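A standard way to sample a k-wise independent distribution is to evaluate a uniformly random polynomial of degree below k at fixed distinct points; "plus noise" then re-randomizes a small fraction of coordinates. The sketch below works over Z_p rather than bits, and the field size and noise rate are illustrative assumptions:

```python
import random

def kwise_sample(n, k, p=11, rng=random):
    """Sample from a k-wise independent distribution over (Z_p)^n:
    evaluate a uniformly random polynomial of degree < k at the points
    1..n (distinct mod p, assuming n < p). Any k coordinates of the
    result are jointly uniform."""
    coeffs = [rng.randrange(p) for _ in range(k)]
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(1, n + 1)]

def add_noise(sample, p=11, rate=0.1, rng=random):
    """'Plus noise': independently re-randomize each coordinate with
    probability `rate` -- the perturbation that the talk shows makes
    these distributions much more powerful as PRG building blocks."""
    return [rng.randrange(p) if rng.random() < rate else v for v in sample]

rng = random.Random(1)
x = add_noise(kwise_sample(n=8, k=3, rng=rng), rng=rng)
```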
32-G575
March 23
From Robustness to Privacy and Back
Northeastern University
Add to Calendar
2023-03-23 16:00:00
2023-03-23 17:00:00
America/New_York
From Robustness to Privacy and Back
Abstract: We study the relationship between two desiderata of algorithms in statistical inference and machine learning: differential privacy and robustness to adversarial data corruptions. Their conceptual similarity was first observed by Dwork and Lei (STOC 2009), who showed that private algorithms satisfy robustness and gave a general method for converting robust algorithms to private ones. However, all general methods for transforming robust algorithms into private ones lead to suboptimal error rates. Our work gives the first black-box transformation that converts any adversarially robust algorithm into one that satisfies pure differential privacy. Moreover, we show that for any low-dimensional estimation task, applying our transformation to an optimal robust estimator results in an optimal private estimator. Thus, we conclude that for any low-dimensional task, the optimal error rate for ε-differentially private estimators is essentially the same as the optimal error rate for estimators that are robust to adversarially corrupting 1/ε training samples. We apply our transformation to obtain new optimal private estimators for several high-dimensional tasks, including Gaussian (sparse) linear regression and PCA. Finally, we present an extension of our transformation that leads to approximate differentially private algorithms whose error does not depend on the range of the output space, which is impossible under pure differential privacy.

Joint work with Hilal Asi and Jonathan Ullman.
32-D507
March 15
Add to Calendar
2023-03-15 16:00:00
2023-03-15 17:00:00
America/New_York
Sampling with Barriers: Faster Mixing via Lewis Weights
Abstract: We analyze Riemannian Hamiltonian Monte Carlo (RHMC) for sampling a polytope defined by m inequalities in R^n, endowed with the metric defined by the Hessian of a self-concordant convex barrier function. We use a hybrid of the p-Lewis weight barrier and the standard logarithmic barrier and prove that the mixing rate is bounded by \tilde{O}(m^{1/3} n^{4/3}), improving on the previous best bound of \tilde{O}(m n^{2/3}) based on the log barrier. Our analysis overcomes several technical challenges to establish this result, in the process deriving smoothness bounds on Hamiltonian curves and extending self-concordance notions to the infinity norm; both properties appear to be of independent interest.
32-G575
March 09
A data-centric view on reliable generalization: From ImageNet to LAION-5B
University of Washington
Add to Calendar
2023-03-09 15:00:00
2023-03-09 16:00:00
America/New_York
A data-centric view on reliable generalization: From ImageNet to LAION-5B
Abstract: Researchers have proposed many algorithms to make neural networks more reliable under distribution shift, yet there is still large room for improvement. Are better training algorithms or better training data the more promising way forward? In this talk, we study this question in the context of OpenAI’s CLIP model for learning from image-text data.

First, we survey the current robustness landscape based on a large-scale experimental study involving more than 200 different models and test conditions. The CLIP models stand out with unprecedented robustness on multiple challenging distribution shifts. To further improve CLIP, we then introduce new algorithms for reliably fine-tuning models by interpolating the weights of multiple models. Next, we investigate the cause of CLIP’s robustness via controlled experiments to disentangle the influence of language supervision and training distribution. While CLIP leveraged large-scale language supervision for the first time, its robustness actually comes from the pre-training dataset.
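The weight interpolation used for fine-tuning can be sketched in a few lines; the dict-of-lists weight format below is a stand-in for real model parameters, and the key assumption (stated in the model-soup / WiSE-FT line of work) is that both endpoints were fine-tuned from the same pre-trained initialization:

```python
def interpolate_weights(w_a, w_b, alpha=0.5):
    """Weight-space interpolation of two models with identical architecture:
    elementwise (1 - alpha) * w_a + alpha * w_b for every parameter tensor.
    alpha = 0 recovers the first model, alpha = 1 the second."""
    return {name: [(1 - alpha) * a + alpha * b
                   for a, b in zip(w_a[name], w_b[name])]
            for name in w_a}

# Toy "models" with a single two-parameter layer.
zero_shot = {"layer1": [1.0, 2.0]}
fine_tuned = {"layer1": [3.0, 4.0]}
print(interpolate_weights(zero_shot, fine_tuned, alpha=0.5))  # → {'layer1': [2.0, 3.0]}
```

Sweeping alpha trades off in-distribution accuracy of the fine-tuned model against the robustness of the starting point.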
32-G449
March 08
Add to Calendar
2023-03-08 16:00:00
2023-03-08 17:00:00
America/New_York
New Directions in Algorithms with Predictions: Learning and Privacy
Abstract: A burgeoning paradigm in algorithm design is learning-augmented algorithms, or algorithms with predictions, where methods can take advantage of a (possibly imperfect) prediction about their instance. While past work has focused on using predictions to improve competitive ratios and runtime, this talk addresses a different, salient question: how do we learn the predictions themselves? We introduce an approach for co-designing learning-augmented algorithms with their own custom learning algorithms, the crucial step being to optimize nice surrogate losses that bound the algorithms’ costs. This leads to improved sample complexity bounds for several learning-augmented graph algorithms and the first learning-theoretic guarantees for page migration with predictions, among other contributions. We also instantiate these ideas in the new direction of learning-augmented private algorithms, where the goal is to reduce the utility loss due to privacy rather than runtime. Our approach drives numerous insights on how to robustly incorporate external information to release better statistics of sensitive datasets, which we verify empirically on the task of multiple quantile release.
32-G575