2025-12-10 16:00:00
2025-12-10 17:00:00
America/New_York
Which Algorithms Have Tight Generalization Bounds?
Generalization measures (GMs) are a central tool for providing performance guarantees and informing algorithmic design. Yet, most such bounds are known to be loose (or even vacuous) in practice. In a series of works [1, 2], we focus on GMs in the overparametrized supervised learning setting, where we show that this looseness is unavoidable due to fundamental lower bounds. Notably, these lower bounds hold on average over finite collections of distributions, and with numerically appreciable values.

For GMs that are computed solely from the training sample but depend on neither the algorithm nor the distribution, we show non-tightness across a large fraction of (algorithm, distribution) combinations. For GMs that can also depend on the algorithm, we show that there can be a trade-off between the algorithm’s ability to learn and the ability to verify the algorithm’s learning success with a GM. Next, we study algorithm-dependent GMs for algorithms that admit a natural notion of algorithmic implicit bias. There, non-tightness of GMs provably occurs whenever the underlying distribution class is rich enough, which is the case, for example, when learning VC-classes. Lastly, we show that a certain notion of algorithmic stability is sufficient for the existence of tight GMs.

Joint work with Ido Nachum (University of Haifa), Jonathan Shafer (MIT), and Michael Gastpar (EPFL).

[1]: ICLR 2024, see https://arxiv.org/abs/2309.13658
[2]: NeurIPS 2025 (Spotlight), see https://arxiv.org/abs/2410.01969
TBD
December 10
December 03
2025-12-03 16:00:00
2025-12-03 17:00:00
America/New_York
Settling the Pass Complexity of Approximate Matchings in Dynamic Graph Streams
Abstract: In the dynamic streaming model, an $n$-vertex input graph is defined through a sequence of edge insertions and deletions in a stream. The algorithms are allowed to process this stream in multiple passes while using $O(n \,\mathrm{poly}\log n)$ space. This model was introduced in the seminal work of Ahn, Guha, and McGregor (AGM) in 2012 and has been studied extensively since.

In this talk, we focus on approximating maximum matching in the dynamic stream model. An $O(1)$-approximation algorithm in $O(\log n)$ passes was already introduced by [AGM12], but improving the number of passes has remained elusive. We give a randomized sketching-based algorithm that achieves an $O(1)$-approximation in only $O(\log \log n)$ passes and $O(n \,\mathrm{poly}\log n)$ space, which exponentially improves the state of the art. Using standard techniques, the approximation ratio of this algorithm can be improved to a $(1+\epsilon)$-approximation for any constant $\epsilon > 0$ with asymptotically the same space and number of passes.

Furthermore, we prove the first multi-pass lower bound for this problem, showing that $\Omega(\log \log n)$ passes are also necessary for any algorithm that finds an $O(1)$-approximation to maximum matching in $O(n \,\mathrm{poly}\log n)$ space.

Our upper and lower bounds collectively settle the pass complexity of this fundamental problem in the dynamic streaming model. The talk, however, will primarily focus on the algorithmic part of our results.

This is joint work with Sepehr Assadi, Christian Konrad, Kheeran K. Naidu, and Janani Sundaresan.

Speaker Bio: Soheil Behnezhad is an Assistant Professor of Computer Science at Northeastern University. Before joining Northeastern, he was a Motwani Postdoctoral Fellow at Stanford University. He received his PhD from the University of Maryland and his BSc from Sharif University. His research focuses on theoretical computer science, particularly the foundations of big data algorithms---including sublinear-time, parallel, streaming, and dynamic algorithms---as well as graph algorithms. He has received Best Paper Awards at SODA 2023 and STOC 2025, and his work is supported by an NSF CAREER Award and a Google Research Award.
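For intuition about the streaming access pattern only, here is a minimal Python sketch (with illustrative names) of the classical single-pass greedy baseline for insertion-only streams: it maintains a maximal matching, hence a 2-approximation, in O(n) space, but it cannot handle the deletions allowed in the dynamic model and is not the sketching-based algorithm of the talk.

def greedy_matching_stream(edge_stream):
    """One pass over an insertion-only edge stream, keeping a maximal matching.
    A maximal matching has at least half the size of a maximum matching, so
    this is a 2-approximation; baseline for intuition only."""
    matched = set()      # matched vertices, O(n) space
    matching = []
    for u, v in edge_stream:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

# toy usage: the path 1-2-3-4 streamed edge by edge
print(greedy_matching_stream([(1, 2), (2, 3), (3, 4)]))  # [(1, 2), (3, 4)]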
TBD
November 19
2025-11-19 16:00:00
2025-11-19 17:00:00
America/New_York
Fast Algorithms for Graph Arboricity and Related Problems
We give an algorithm for finding the arboricity of a weighted, undirected graph, defined as the minimum number of spanning forests that cover all edges of the graph, in $\sqrt{n}\, m^{1+o(1)}$ time. This improves on the previous best bound of $O(nm)$ for weighted graphs and $O(m^{3/2})$ for unweighted graphs (Gabow 1995) for this problem. The running time of our algorithm is dominated by a logarithmic number of calls to a directed global minimum cut subroutine—if the running time of the latter problem improves to $m^{1+o(1)}$ (thereby matching the running time of maximum flow), the running time of our arboricity algorithm would improve further to $m^{1+o(1)}$.

As an application, we also give a new algorithm for computing the entire cut hierarchy—laminar multiway cuts with minimum cut ratio in recursively defined induced subgraphs—in $m n^{1+o(1)}$ time. For the cut hierarchy problem, the previous best bound was $O(n^2 m)$ for weighted graphs and $O(n m^{3/2})$ for unweighted graphs.

This is joint work with Ruoxu Cen, Henry Fleischmann, Jason Li, and Debmalya Panigrahi.
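For a concrete reference point (not the algorithm above), the arboricity of a small unweighted graph can be computed by brute force from the Nash-Williams formula, which characterizes the spanning-forest cover number as the maximum of ⌈|E(H)|/(|V(H)|-1)⌉ over subgraphs H with at least two vertices. A minimal Python sketch with illustrative names:

from itertools import combinations
from math import ceil

def arboricity_bruteforce(vertices, edges):
    """Nash-Williams formula for unweighted graphs; exponential time, meant
    only to illustrate the quantity that the fast algorithm computes."""
    best = 0
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            S = set(subset)
            inside = sum(1 for u, v in edges if u in S and v in S)
            best = max(best, ceil(inside / (len(S) - 1)))
    return best

# K4 has 6 edges on 4 vertices, so its arboricity is ceil(6/3) = 2
print(arboricity_bruteforce([0, 1, 2, 3],
                            [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]))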
TBD
November 12
Low-Query Locally Testable Codes
University of Haifa
2025-11-12 16:00:00
2025-11-12 17:00:00
America/New_York
Low-Query Locally Testable Codes
Locally testable codes (LTCs) are a special kind of error correcting codes where the receiver can correctly detect, with high probability, whether the received data was significantly corrupted by reading just a few of its letters (chosen at random according to some distribution). This is useful because decoding very corrupted data may be time-consuming --- it is better to ask for retransmission in such a case. However, LTCs are usually motivated because of their connection to probabilistically checkable proofs (PCPs). While good error correcting codes were well-known to exist almost since the topic was founded, the existence of good LTCs remained open for a long time. In fact, LTCs were even shown to be constrained, e.g., Ben-Sasson, Goldreich and Sudan showed that there are no good 2-query LTCs over a binary alphabet, where "2-query" means that one is allowed to read only 2 letters from the received word. Surprisingly, Dinur-Evra-Livne-Lubotzky-Mozes and Panteleev-Kalachev (independently) showed recently that good LTCs do exist. In a more recent work with Tali Kaufman, we showed that there are even good 2-query LTCs (with a huge alphabet), and that the DELLM construction can be explained by means of cosystolic expansion (of sheaves). Finally, in a recent (still unpublished) work with Stav Lazarovich, we show that good 2-query LTCs exist for every alphabet of 3 or more letters, so the restriction proved by Ben-Sasson, Goldreich and Sudan is the only one.

I will survey the above works and then explain some of the key ideas of my work with Kaufman, namely, the use of sheaves (a.k.a. local systems) on cell complexes to construct codes with a 2-query tester, and the use of a novel (almost-)local criterion to establish their desired properties.
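To make the notion of a local test concrete, here is the classical 3-query BLR linearity test for the Hadamard code (a standard illustration, not any of the constructions discussed above): the tester reads only three positions of the received word.

import random

def blr_linearity_test(f, n, trials=100):
    """f is the truth table of a function {0,1}^n -> {0,1}, indexed by integers.
    Each trial reads only three positions: f(x), f(y), f(x XOR y). Linear
    functions (Hadamard codewords) always pass; words far from every linear
    function are rejected with noticeable probability."""
    for _ in range(trials):
        x = random.randrange(2 ** n)
        y = random.randrange(2 ** n)
        if f[x] ^ f[y] != f[x ^ y]:
            return False  # reject
    return True  # accept

# the linear function f(x) = parity of the bits of x always passes
n = 4
parity = [bin(i).count("1") % 2 for i in range(2 ** n)]
print(blr_linearity_test(parity, n))  # True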
TBD
November 05
2025-11-05 16:00:00
2025-11-05 17:00:00
America/New_York
Zeroth-order log-concave sampling: Uniform sampling from convex bodies
Since the development of the first randomized polynomial-time algorithm for volume computation by Dyer, Frieze, and Kannan in 1989, convex-body sampling has been a central problem at the intersection of algorithms, geometry, and probability. A major milestone came in 1997, when Kannan, Lovász, and Simonovits analyzed the Ball Walk and formulated the influential KLS conjecture. This was extended to log-concave distributions by Lovász and Vempala in 2006, and further accelerated by Cousins and Vempala in 2015 through warm-start generation techniques.

In this talk, I will present new and principled approaches that understand, streamline, and improve these advances. First, I propose a simple variant of the proximal sampler that achieves the query complexity with matched Rényi orders between the initial warmness and the output guarantee. Then, I introduce a simple annealing scheme that produces a warm start in q-Rényi divergence. To relay a Rényi warmness across the annealing scheme, I establish hypercontractivity under simultaneous heat flow and translate it into an improved mixing guarantee for the proximal sampler under a logarithmic Sobolev inequality. These results extend naturally to general log-concave distributions accessible via evaluation oracles, incurring additional quadratic queries.

The talk will be based on joint work with Santosh Vempala.
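For context, the Ball Walk mentioned above is simple to state: from the current point, propose a uniformly random point in a small ball and move only if the proposal stays inside the convex body, accessed through a membership (zeroth-order) oracle. A minimal Python sketch with illustrative, untuned parameters:

import numpy as np

def ball_walk(membership_oracle, x0, radius, steps, rng=np.random.default_rng(0)):
    """Ball Walk for uniform sampling from a convex body K given only a
    membership oracle. Proposes x + radius * u with u uniform in the unit
    ball and stays put if the proposal leaves K. Step size and step count
    here are illustrative only."""
    x = np.array(x0, dtype=float)
    d = len(x)
    for _ in range(steps):
        u = rng.normal(size=d)
        u *= rng.random() ** (1.0 / d) / np.linalg.norm(u)  # uniform in the unit ball
        y = x + radius * u
        if membership_oracle(y):
            x = y
    return x

# example: (approximately) uniform sample from the unit Euclidean ball in R^10
in_ball = lambda y: np.linalg.norm(y) <= 1.0
print(ball_walk(in_ball, np.zeros(10), radius=0.2, steps=5000))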
TBD
October 29
The Mysterious Query Complexity of Tarski Fixed Points
Columbia University
2025-10-29 16:00:00
2025-10-29 17:00:00
America/New_York
The Mysterious Query Complexity of Tarski Fixed Points
Tarski's fixed point theorem has extensive applications across many fields, including verification, semantics, game theory, and economics. Recently, the complexity of finding a Tarski fixed point has attracted significant attention.

In this talk, I will introduce the problem of computing a Tarski fixed point over a grid $[N]^d$, highlight our recent progress toward a better understanding of it, and discuss the surprising journey and the mysteries surrounding its complexity.

Based on joint work with Xi Chen and Mihalis Yannakakis.
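As a warm-up to the problem statement: in one dimension (d = 1), a Tarski fixed point of a monotone f mapping {1,...,N} to itself can be found with O(log N) queries by binary search, since the search interval always contains a fixed point. The interesting questions discussed in the talk concern the grid $[N]^d$ for larger d. A minimal Python sketch of the 1D warm-up:

def tarski_1d(f, N):
    """Find x in {1,...,N} with f(x) = x, assuming f is monotone and maps
    {1,...,N} into itself (a fixed point exists by Tarski's theorem).
    Invariant: the interval [lo, hi] always contains a fixed point."""
    lo, hi = 1, N
    while True:
        mid = (lo + hi) // 2
        v = f(mid)
        if v == mid:
            return mid
        elif v > mid:
            lo = mid + 1   # f maps {mid+1,...,N} into itself
        else:
            hi = mid - 1   # f maps {1,...,mid-1} into itself

# example: a monotone function on {1,...,100} whose unique fixed point is 90
print(tarski_1d(lambda x: min(x + 1, 90), 100))  # 90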
TBD
October 27
Fast Mixing of 1D Quantum Gibbs Samplers at All Temperatures
Anthony (Chi-Fang) Chen
UC Berkeley
2025-10-27 15:00:00
2025-10-27 16:00:00
America/New_York
Fast Mixing of 1D Quantum Gibbs Samplers at All Temperatures
Recently, quantum computing analogs of classical Gibbs samplers have been introduced—quantum Markov chains that generalize Glauber or Metropolis dynamics and serve as models of nature’s thermalization process. In this work, we show that every one-dimensional quantum Hamiltonian with short-range interactions admits a quantum Gibbs sampler with a system-size–independent, optimal spectral gap at all finite temperatures. As a consequence, their Gibbs states can be prepared in polylogarithmic depth and exhibit exponential clustering of correlations, extending the classic result of Araki (1969). (Joint work with Thiago Bergamaschi; see https://arxiv.org/abs/2510.08533.)
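For readers more familiar with the classical side, the quantum Gibbs samplers above generalize chains like the following heat-bath (Glauber) dynamics for a one-dimensional classical Ising chain; this Python sketch is only the classical analogue, shown for intuition, and is not the quantum algorithm of the paper.

import math, random

def glauber_1d_ising(n, beta, J=1.0, steps=100000, seed=0):
    """Heat-bath (Glauber) dynamics for a 1D Ising chain with free boundaries.
    Each step resamples one spin from its conditional distribution given its
    neighbours, so the stationary distribution is the Gibbs state at inverse
    temperature beta."""
    rng = random.Random(seed)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        field = J * ((s[i - 1] if i > 0 else 0) + (s[i + 1] if i < n - 1 else 0))
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        s[i] = 1 if rng.random() < p_up else -1
    return s

sample = glauber_1d_ising(n=50, beta=0.5)
print(sum(sample))  # net magnetization of one sample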
TBD
October 22
2025-10-22 16:00:00
2025-10-22 17:00:00
America/New_York
Fault-Tolerance in Buy-at-Bulk and Hop-Constrained Network Design
Buy-at-bulk network design is a classical and practically motivated problem, in which the goal is to construct a low-cost network that supports multi-commodity flow between given node pairs. In this problem, the cost of purchasing capacity on an edge is given by a sub-additive function, thus capturing settings that exhibit economies of scale. In the fault-tolerant setting, the goal is to protect against node or edge failures by instead sending flow for each demand pair on k disjoint paths. Despite substantial work addressing various special cases, there is still no nontrivial approximation algorithm known for fault-tolerant buy-at-bulk, even for protecting against a single edge failure (k=2)!

In this talk, I will highlight some recent progress in buy-at-bulk network design via a connection to hop-constrained network design. Leveraging recent progress in hop-constrained oblivious routing, we obtain several polylogarithmic approximation algorithms for hop-constrained network design problems, including a relaxed notion of fault-tolerance that is of independent interest. I will conclude by briefly discussing new algorithmic techniques towards resolving the fault-tolerant buy-at-bulk problem.

This talk is based on joint work with Chandra Chekuri.
TBD
October 20
2025-10-20 16:00:00
2025-10-20 17:00:00
America/New_York
Memory as a lens to understand learning and optimization
What is the role of memory in learning and optimization? The optimal convergence rates (measured in terms of the number of oracle queries or samples needed) for various optimization problems are achieved by computationally expensive optimization techniques, such as second-order methods and cutting-plane methods. We will discuss whether simpler, faster, and memory-limited algorithms such as gradient descent can achieve these optimal convergence rates for the prototypical optimization problem of minimizing a convex function with access to a gradient. Our results hint at a perhaps curious dichotomy---it is not possible to significantly improve on the convergence rate of known memory-efficient techniques (which are linear-memory variants of gradient descent for many of these problems) without using substantially more memory (quadratic memory for many of these problems). Therefore, memory could be a useful discerning factor to provide a clear separation between 'efficient' and 'expensive' techniques. This perspective can also be applied to understand mechanisms which transformers use to solve certain algorithmic tasks. We will show empirical evidence that transformers learn to achieve second-order convergence rates for solving linear-regression tasks, leveraging some of the theory of optimization and memory-limited learning.

This is mostly based on joint work with Annie Marsden, Aaron Sidford, Gregory Valiant, Deqing Fu, Tian-Qi Chen and Robin Jia.
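As a point of reference for the oracle model in the abstract: gradient descent is the canonical memory-light method, since each step needs only the current iterate and one gradient query (O(d) numbers), whereas cutting-plane methods maintain on the order of d^2 numbers. A minimal Python sketch of the first-order-oracle interface, with illustrative names and not tied to the specific results above:

import numpy as np

def gradient_descent(grad_oracle, x0, step, iters):
    """The memory footprint is two length-d vectors (iterate and gradient),
    i.e. the 'linear memory' regime contrasted with quadratic-memory methods
    in the talk."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x -= step * grad_oracle(x)
    return x

# example: minimize the convex quadratic f(x) = ||x - 1||^2 / 2 over R^100
grad = lambda x: x - np.ones_like(x)
x_final = gradient_descent(grad, np.zeros(100), step=0.5, iters=50)
print(np.linalg.norm(x_final - 1.0))  # close to 0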
TBD
October 15
Quality Control on Random Graphs in Sublinear Time
Harvard University
2025-10-15 16:00:00
2025-10-15 17:00:00
America/New_York
Quality Control on Random Graphs in Sublinear Time
Many algorithms are designed to perform well on random instances. However, when running such an algorithm on a specific input, can we trust its output? We define a new class of problems whose formulation allows algorithms to benefit from the randomness of instances while remaining reliable to use on any given input. We call these “quality control” problems, specified by a distribution D and a real-valued quality function Q. We say that an input x is “high quality” if Q(x) is approximately 1, and assume that samples from D are high quality with high probability. The task is to accept x if it is drawn from D and reject x if Q(x) is far from 1.

We study this problem in the sublinear setting, where inputs are graphs and the algorithm has access to adjacency matrix queries. Focusing on the case where D is a (sparse) Erdős–Rényi random graph model and Q is proportional to the number of k-cliques in a graph, we show that quality control can be solved superpolynomially faster than the related task of approximating Q in worst-case graphs in the sublinear access model.

Joint work with Ronitt Rubinfeld (MIT) and Madhu Sudan (Harvard).
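To make the setup concrete for k = 3 (triangles), here is a naive Monte Carlo quality check that uses only adjacency-matrix queries: it compares a sampled triangle count against the G(n, p) expectation. It needs far more queries than the algorithms in the talk (especially for sparse p) and is a Python sketch of the problem statement only, with illustrative names.

import random
from math import comb

def triangle_quality(adj_query, n, p, samples=200000, tol=0.2, seed=0):
    """adj_query(u, v) -> bool is an adjacency-matrix query. Estimates
    Q = (#triangles) / E[#triangles under G(n, p)] by sampling random vertex
    triples, and accepts iff the estimate is close to 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u, v, w = rng.sample(range(n), 3)
        if adj_query(u, v) and adj_query(v, w) and adj_query(u, w):
            hits += 1
    estimate = comb(n, 3) * hits / samples
    q = estimate / (comb(n, 3) * p ** 3)
    return abs(q - 1.0) <= tol, q

# toy usage, where A is a 0/1 adjacency matrix given as a list of lists:
# accept, q = triangle_quality(lambda u, v: bool(A[u][v]), n=len(A), p=0.5)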
TBD
October 08
2025-10-08 16:00:00
2025-10-08 17:00:00
America/New_York
Approximating High-Dimensional Earth Mover’s Distance as Fast as Closest Pair
We give a reduction from $(1+\epsilon)$-approximate Earth Mover's Distance (EMD) to $(1+\epsilon)$-approximate Closest Pair. As a consequence, we improve the fastest known approximation algorithm for high-dimensional EMD. Here, given $p\in [1, 2]$ and two sets of $n$ points $X,Y \subset (\mathbb{R}^d,\ell_p)$, their EMD is the minimum cost of a perfect matching between $X$ and $Y$, where the cost of matching two vectors is their $\ell_p$ distance. Further, Closest Pair is the basic problem of finding a pair of points realizing $\min_{x \in X, y\in Y} \|x-y\|_p$.
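For small instances, both quantities in the statement can be computed exactly, which makes the two endpoints of the reduction concrete: EMD is a minimum-cost perfect matching (an assignment problem) and Closest Pair is a single minimization over pairs. A minimal Python sketch using SciPy's exact assignment solver (exact and slow, unlike the approximate high-dimensional algorithms in the talk):

import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_exact(X, Y, p=2):
    """EMD between equal-size point sets X, Y in (R^d, l_p): the minimum
    total l_p cost of a perfect matching between them."""
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], ord=p, axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()

def closest_pair(X, Y, p=2):
    """min over x in X, y in Y of the l_p distance ||x - y||_p."""
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], ord=p, axis=2).min()

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(100, 20)), rng.normal(size=(100, 20))
print(emd_exact(X, Y), closest_pair(X, Y))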
TBD
October 01
2025-10-01 16:00:00
2025-10-01 17:00:00
America/New_York
Faster Mixing of the Jerrum-Sinclair Chain
We show that the Jerrum-Sinclair Markov chain on matchings mixes in time $\widetilde{O}(\Delta^2 m)$ on any graph with $n$ vertices, $m$ edges, and maximum degree $\Delta$, for any constant edge weight $\lambda>0$. For general graphs with arbitrary, potentially unbounded $\Delta$, this provides the first improvement over the classic $\widetilde{O}(n^2 m)$ mixing time bound of Jerrum and Sinclair (1989) and Sinclair (1992).

To achieve this, we develop a general framework for analyzing mixing times, combining ideas from the classic canonical path method with the "local-to-global" approaches recently developed in high-dimensional expanders, introducing key innovations to both techniques.

Based on joint work with Weiming Feng (The University of Hong Kong); Zhe Ju, Tianshun Miao, Yitong Yin, and Xinyuan Zhang (Nanjing University).
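For reference, the chain itself is easy to state. One transition of the Jerrum-Sinclair matching chain with stationary weight proportional to λ^{|M|}, in its standard Metropolis form, looks roughly as follows; this Python sketch is for intuition only and is not the analysis framework of the paper.

import random

def js_step(M, edges, lam, rng):
    """One proposed move of the Jerrum-Sinclair chain on matchings. M is a
    frozenset of edges (each a frozenset of two vertices); a uniformly random
    edge e is proposed, the move (remove / add / slide) is formed, and it is
    accepted with a Metropolis filter for the weight lam ** |M|."""
    e = rng.choice(edges)
    u, v = tuple(e)
    matched = {x for f in M for x in f}
    M_new = None
    if e in M:                                   # down move: remove e
        M_new = M - {e}
    elif u not in matched and v not in matched:  # up move: add e
        M_new = M | {e}
    elif (u in matched) != (v in matched):       # slide: replace the edge
        x = u if u in matched else v             # covering the matched endpoint
        f = next(g for g in M if x in g)
        M_new = (M - {f}) | {e}
    if M_new is None:                            # both endpoints matched elsewhere
        return M
    accept = min(1.0, lam ** (len(M_new) - len(M)))
    return M_new if rng.random() < accept else M

# toy usage on a triangle graph
edges = [frozenset(pair) for pair in [(0, 1), (1, 2), (0, 2)]]
rng = random.Random(0)
M = frozenset()
for _ in range(10):
    M = js_step(M, edges, lam=1.0, rng=rng)
print(M)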
TBD
September 24
2025-09-24 16:00:00
2025-09-24 17:00:00
America/New_York
Learning and Incentives in Human–AI Collaboration
As AI systems become more capable, a central challenge is designing them to work effectively with humans. Game theory and online learning provide a natural toolkit for analyzing such interactions, where both humans and algorithms adapt strategically over time.

Consider a doctor who wants to diagnose a patient and can consult an AI. Suppose, to start, that the AI is aligned with the doctor’s goal of accurate diagnosis. Even under this cooperative assumption, the doctor and AI may each hold only partial and incomparable knowledge, and they may not have a shared prior. How can we guarantee that their combined predictions are strictly better than relying on either alone, without assuming shared priors or stochastic data? Importantly, they may face many such cases together over time, and this repeated interaction enables richer forms of collaboration. I will present learning-theoretic results showing that collaboration is possible in a fully distribution-free repeated prediction setting, with protocols that ensure regret bounds against benchmarks defined on the joint information of both the human and the AI.

In the second part, I will return to the alignment assumption itself. What if the AI is not necessarily aligned with the doctor’s goals? For example, what if the AI model is developed by a pharmaceutical company that has a preference for the doctor prescribing their own drugs? To address such misaligned incentives, it is natural to consider a setting where the doctor has access to multiple AI models, each offered by a different provider. Although each provider may be misaligned, under a mild average alignment assumption—that the doctor’s utility lies in the convex hull of the providers’ utilities—we show that in Nash equilibrium of the competition among providers, the doctor can achieve the same outcomes they would if a perfectly aligned provider were present. The analysis builds on ideas from Bayesian persuasion and information design, adapted to settings with competing AI providers.

This talk is based on the following works: Tractable Agreement Protocols (STOC’25), Collaborative Prediction: Tractable Information Aggregation via Agreement (in submission), and Emergent Alignment via Competition (in submission).
TBD
September 19
2025-09-19 14:00:00
2025-09-19 15:00:00
America/New_York
Distribution Learning with Advice
Arnab Bhattacharyya
Abstract: We revisit the problem of distribution learning within the framework of learning-augmented algorithms. In this setting, we explore the scenario where a probability distribution is provided as potentially inaccurate advice on the true, unknown distribution. Our objective is to develop learning algorithms whose sample complexity decreases as the quality of the advice improves, thereby surpassing standard learning lower bounds when the advice is sufficiently accurate. Specifically, we demonstrate that this outcome is achievable for the problem of learning a multivariate Gaussian distribution $N(\mu, \Sigma)$ in the PAC learning setting. Classically, in the advice-free setting, $\widetilde{\Theta}(d^2/\epsilon^2)$ samples are sufficient and worst-case necessary to learn $d$-dimensional Gaussians up to TV distance $\epsilon$ with constant probability. When we are additionally given a parameter $T$ as advice, we show that $\widetilde{O}(d^{2-\beta}/\epsilon^2)$ samples suffice whenever $\|\sqrt{T}^{-1}\Sigma\sqrt{T}^{-1} - I\|_1 < \epsilon\, d^{1-\beta}$ (where $\|\cdot\|_1$ denotes the entrywise $\ell_1$-norm) for any $\beta$, yielding a polynomial improvement over the advice-free setting. We also show similar results for product distributions over the hypercube.

Joint work with Davin Choo (Harvard), Philips George John (NUS), and Themis Gouleakis (NTU).
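A small Python sketch of the advice-quality quantity in the statement: given the true covariance Σ and the advice matrix T, the relevant measure is the entrywise l1-norm of T^{-1/2} Σ T^{-1/2} - I. This only evaluates the condition on hypothetical inputs; it is not the learning algorithm itself.

import numpy as np
from scipy.linalg import sqrtm

def advice_quality(Sigma, T):
    """Entrywise l1-norm of T^{-1/2} Sigma T^{-1/2} - I; small values mean the
    advice covariance T is close to the truth after whitening."""
    T_inv_sqrt = np.linalg.inv(sqrtm(T).real)
    M = T_inv_sqrt @ Sigma @ T_inv_sqrt
    return np.abs(M - np.eye(len(Sigma))).sum()

d = 5
Sigma = np.eye(d)
T = 1.1 * np.eye(d)          # slightly miscalibrated advice
print(advice_quality(Sigma, T))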
TBD
September 17
2025-09-17 14:00:00
2025-09-17 15:00:00
America/New_York
Metric Embeddings with Outliers
We study the design of embeddings into Euclidean space with outliers. Given a metric space (X, d) and an integer k, the goal is to embed all but k points in X (called the “outliers”) into ℓ2 with the smallest possible distortion c. Finding the optimal distortion c for a given outlier set size k, and finding the smallest k for a given target distortion c, are both NP-hard problems. We consider bi-criteria approximations. Our main result is a polynomial time algorithm that approximates the outlier set size to within an O(log^2 k) factor and the distortion to within a constant factor.

We also show that the techniques we use in designing outlier embeddings into Euclidean space have the potential to extend to a wider variety of target embedding spaces. In particular, we consider probabilistic outlier embeddings, which involve probabilistic embeddings and only require that non-outlier pairs have low distortion in expectation over the randomness of the embedding. We show that for probabilistic outlier embeddings into hierarchically separated trees (HSTs), there is a polynomial time algorithm that takes in a finite metric space with a (k,c) probabilistic outlier embedding and produces a probabilistic outlier embedding with O(log^2 k) outliers and O(c) distortion.
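To pin down the objective: the distortion of an embedding restricted to the non-outliers is the product of the worst expansion and the worst contraction over the surviving pairs (equivalently, the best guarantee achievable after rescaling). A small Python checker with illustrative names, not the approximation algorithm of the talk:

import numpy as np

def outlier_distortion(D, F, outliers):
    """D[i][j] is the input metric, F[i] the embedded point, and `outliers`
    the set of removed points. Returns the distortion over non-outlier pairs:
    (max expansion) * (max contraction), i.e. the smallest c such that, after
    rescaling, every surviving distance is preserved up to a factor c."""
    keep = [i for i in range(len(D)) if i not in outliers]
    expansion = contraction = 0.0
    for a, i in enumerate(keep):
        for j in keep[a + 1:]:
            ratio = np.linalg.norm(F[i] - F[j]) / D[i][j]
            expansion = max(expansion, ratio)
            contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction

# toy example: the 4-cycle metric; dropping one outlier, the remaining path
# 0-1-2 embeds isometrically on a line (distortion 1)
D = np.array([[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]], float)
F = np.array([[0.0], [1.0], [2.0], [0.0]])
print(outlier_distortion(D, F, outliers={3}))  # 1.0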
TBD