October 03

Does Learning Require Memorization? A Short Tale about a Long Tail

2019-10-03 16:00-17:00 (America/New_York)

Abstract: Learning algorithms based on deep neural networks are well known to (nearly) perfectly fit the training set, and even to fit random labels well. The reasons for this tendency to memorize the labels of the training data are not well understood. We provide a simple model for prediction problems in which such memorization is necessary for achieving close-to-optimal generalization error. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. Our analysis demonstrates that memorization becomes necessary whenever the frequency prior is long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. We complement the theoretical results with experiments on several standard benchmarks showing that memorization is an essential part of deep learning.

Based on https://arxiv.org/abs/1906.05271 and ongoing work with Chiyuan Zhang.

Bio: Vitaly Feldman is a research scientist at Google working on design and theoretical analysis of machine learning algorithms. His recent research interests include stability-based and information-theoretic tools for analysis of generalization, privacy-preserving learning, and adaptive data analysis. Vitaly holds a PhD from Harvard (2006) and was previously a research scientist at IBM Research - Almaden (2007-2017). He serves as a director on the steering committee of the Association for Computational Learning and was a program co-chair for COLT 2016.

32-G449 (Stata Center - Patil/Kiva Conference Room)
Belfer sarah_donahue@hks.harvard.edu

October 17

Challenges in Reliable Machine Learning

Kamalika Chaudhuri
University of California, San Diego
2019-10-17 16:15-17:15 (America/New_York)

Abstract: As machine learning is increasingly used in real applications, there is a need for reliable and robust methods. In this talk, we will discuss two challenges that arise in reliable machine learning. The first is sample selection bias, where training data is available from a distribution conditioned on a sample selection policy, but the resultant classifier needs to be evaluated on the entire population. We will show how we can use active learning to get a small amount of labeled data from the entire population that can be used to correct this kind of sample selection bias. The second is robustness to adversarial examples -- slight strategic perturbations of legitimate test inputs that cause misclassification. We look at adversarial examples in the context of a simple non-parametric classifier -- the k-nearest neighbor classifier -- and at its robustness properties. We provide bounds on its robustness as a function of k, and propose a more robust 1-nearest neighbor classifier.

Joint work with Songbai Yan, Tara Javidi, Yaoyuan Yang, Cyrus Rashtchian, Yizhen Wang and Somesh Jha.

Bio: Kamalika Chaudhuri received a Bachelor of Technology from the Indian Institute of Technology, Kanpur, and a PhD in Computer Science from the University of California, Berkeley. Currently, she is an Associate Professor at the University of California, San Diego. She is a recipient of the NSF Career Award, Hellman Faculty Fellowship, and Google and Bloomberg Faculty Awards. Kamalika's research is on the foundations of trustworthy machine learning -- which includes problems such as learning from sensitive data while preserving privacy, learning under sampling bias, and learning in the presence of an adversary. She is also broadly interested in a number of topics in learning theory, such as non-parametric methods, online learning, and active learning.
32-G449 (Stata Center, Patil/Kiva Conference Room)

October 24

Neural Stochastic Differential Equations for Sparsely-sampled Time Series

2019-10-24 16:30-17:30 (America/New_York)

Abstract: Much real-world data is sampled at irregular intervals, but most time series models require regularly-sampled data. Continuous-time latent variable models can address this problem, but until now only deterministic models, such as latent ODEs, were efficiently trainable by backprop. We generalize the adjoint sensitivities method to SDEs, constructing an SDE that runs backwards in time and computes all necessary gradients, along with a general algorithm that allows SDEs to be trained by backpropagation with constant memory cost. We also give an efficient algorithm for gradient-based stochastic variational inference in function space, all with the use of adaptive black-box SDE solvers. Finally, we'll show initial results of applying latent SDEs to time series data, and discuss prototypes of infinitely-deep Bayesian neural networks.

Bio: David Duvenaud is an assistant professor in computer science and statistics at the University of Toronto. He holds a Canada Research Chair in generative models. His postdoctoral research was done at Harvard University, where he worked on hyperparameter optimization, variational inference, and chemical design. He did his Ph.D. at the University of Cambridge, studying Bayesian nonparametrics with Zoubin Ghahramani and Carl Rasmussen. David spent two summers in the machine vision team at Google Research, and also co-founded Invenia, an energy forecasting and trading company. David is a founding member of the Vector Institute and a Faculty Fellow at ElementAI.

35-225



This event has been cancelled

October 29

David Spiegelhalter: Communicating uncertainty about facts, numbers and science

2019-10-29 11:00-12:00 (America/New_York)

The claim of a ‘post-truth’ society, in which emotional responses trump balanced consideration of evidence, presents a strong challenge to those who value quantitative and scientific evidence: how can we communicate risks and unavoidable scientific uncertainty in a transparent and trustworthy way?

Communication of quantifiable risks has been well-studied, leading to recommendations for using an expected frequency format. But deeper uncertainty about facts, numbers, or scientific hypotheses needs to be communicated without losing trust and credibility. This is an empirically researchable issue, and I shall describe some current randomised experiments concerning the impact on audiences of alternative verbal, numerical and graphical means of communicating uncertainty. Available evidence may often not permit a quantitative assessment of uncertainty, and I will also examine scales being used to summarise degrees of ‘confidence’ in conclusions, in terms of the quality of the research underlying the whole assessment.

Professor Sir David Spiegelhalter is Chair of the Winton Centre for Risk and Evidence Communication in the University of Cambridge, which aims to improve the way that statistical evidence is used by health professionals, patients, lawyers and judges, media and policy-makers. He advises organisations and government agencies on risk communication and is a regular media commentator on statistical issues, with a particular focus on communicating uncertainty. His background is in medical statistics, and he has over 200 refereed publications and is co-author of 6 textbooks, as well as The Norm Chronicles (with Michael Blastland), and Sex by Numbers. He works extensively with the media, and presented the BBC4 documentaries “Tails you Win: the Science of Chance” and the award-winning “Climate Change by Numbers”, and in 2011 came 7th in an episode of BBC1’s Winter Wipeout. He was elected Fellow of the Royal Society in 2005, and knighted in 2014 for services to medical statistics. He was President of the Royal Statistical Society for 2017-2018. His bestselling book, The Art of Statistics, was published in March 2019. He is @d_spiegel on Twitter, and his home page is http://www.statslab.cam.ac.uk/~david/

http://evite.me/xydVhP9GHQ

Seminar Room D463 (Star)

October 31

Generative Modeling by Estimating Gradients of the Data Distribution

2019-10-31 16:30-17:30 (America/New_York)

Abstract: Existing generative models are typically based on explicit representations of probability distributions (e.g., autoregressive models or VAEs) or implicit sampling procedures (e.g., GANs). We propose an alternative approach based on directly modeling the vector field of gradients of the data distribution (scores). Our framework allows flexible energy-based model architectures and requires no sampling during training or the use of adversarial training methods. Using annealed Langevin dynamics, we produce samples comparable to GANs on the MNIST, CelebA and CIFAR-10 datasets, achieving a new state-of-the-art inception score of 8.91 on CIFAR-10. Finally, I will discuss challenges in evaluating bias and generalization in generative models.

Bio: Stefano Ermon is an Assistant Professor of Computer Science in the CS Department at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory, and a fellow of the Woods Institute for the Environment. His research is centered on techniques for probabilistic modeling of data, inference, and optimization, and is motivated by a range of applications, in particular ones in the emerging field of computational sustainability. He has won several awards, including four Best Paper Awards (AAAI, UAI and CP), an NSF Career Award, ONR and AFOSR Young Investigator Awards, a Sony Faculty Innovation Award, an AWS Machine Learning Award, a Hellman Faculty Fellowship, a Microsoft Research Fellowship, and the IJCAI Computers and Thought Award. Stefano earned his Ph.D. in Computer Science at Cornell University in 2015.

35-225
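For readers unfamiliar with the sampling procedure named in the abstract above, the following is a minimal one-dimensional sketch of annealed Langevin dynamics. It is an illustration under stated assumptions, not the speaker's implementation: the learned score network is replaced by the closed-form score of a Gaussian data distribution blurred with noise, and the noise schedule, step size `eps`, and all function names are hypothetical choices made for this example.

```python
import math
import random


def gaussian_score(x, mu, data_std, noise_std):
    """Score (gradient of log density) of N(mu, data_std^2) blurred with Gaussian noise.

    Assumption: this closed form stands in for a trained score network.
    """
    variance = data_std ** 2 + noise_std ** 2
    return -(x - mu) / variance


def annealed_langevin(score, noise_levels, steps_per_level=100, eps=2e-2, rng=None):
    """Run Langevin updates at each noise level, from the largest to the smallest."""
    rng = rng or random.Random(0)
    x = rng.uniform(-10.0, 10.0)  # arbitrary starting point
    smallest = noise_levels[-1]
    for sigma in noise_levels:
        # Step size shrinks with the noise level.
        alpha = eps * (sigma / smallest) ** 2
        for _ in range(steps_per_level):
            z = rng.gauss(0.0, 1.0)
            x = x + 0.5 * alpha * score(x, sigma) + math.sqrt(alpha) * z
    return x


def sample_many(n=500, mu=3.0, data_std=0.5, seed=0):
    """Draw n samples; mu and data_std describe the made-up target distribution."""
    rng = random.Random(seed)
    noise_levels = [2.0, 1.0, 0.5, 0.25, 0.1]  # hypothetical decreasing schedule

    def score(x, sigma):
        return gaussian_score(x, mu, data_std, sigma)

    return [annealed_langevin(score, noise_levels, rng=rng) for _ in range(n)]


if __name__ == "__main__":
    samples = sample_many()
    mean = sum(samples) / len(samples)
    print("sample mean:", mean)
```

The large early noise levels let the chain move quickly toward the bulk of the distribution from a poor starting point, while the final small level refines the samples near the target.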

November 07

Advancements in Graph Neural Networks

2019-11-07 16:15-17:15 (America/New_York)

Abstract: Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. In this talk I will discuss recent advancements in the field of Graph Neural Networks that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning. I will provide a conceptual overview of key advancements in this area of representation learning on graphs, including graph convolutional networks and their representational power. We will also discuss applications to web-scale recommender systems, healthcare, and knowledge representation and reasoning.

Bio: Jure Leskovec is Associate Professor of Computer Science at Stanford University, Chief Scientist at Pinterest, and investigator at Chan Zuckerberg Biohub. His research focuses on machine learning and data mining with graphs, a general language for describing social, technological and biological systems. Computation over massive data is at the heart of his research and has applications in computer science, social sciences, marketing, and biomedicine. This research has won several awards including a Lagrange Prize, Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test of time awards. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his PhD in machine learning from Carnegie Mellon University, and postdoctoral training at Cornell University.

32-G449 (Patil/Kiva Conference Room)
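As a rough illustration of how a graph convolutional network encodes graph structure into node embeddings, here is a minimal sketch of a single layer using the widely used symmetric-normalization rule, ReLU(D^{-1/2} (A + I) D^{-1/2} H W). This is an assumption about which variant to show, not necessarily the formulation covered in the talk; the tiny path graph, features, and weights are made up, and the helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """One graph convolution: normalize, aggregate neighbors, transform, apply ReLU."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j).
    normalized = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    aggregated = matmul(normalized, features)
    transformed = matmul(aggregated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


if __name__ == "__main__":
    # Made-up example: path graph 0 - 1 - 2 with a one-dimensional feature per node.
    adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    features = [[1.0], [0.0], [0.0]]
    weights = [[1.0, -1.0]]  # maps 1 input feature to 2 output features
    for row in gcn_layer(adjacency, features, weights):
        print(row)
```

After one layer, node 1 has picked up part of node 0's feature, while node 2, two hops away, has not; stacking layers widens the neighborhood each embedding depends on.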

November 21

Synthetic Control (NeurIPS 2019 tutorial)

2019-11-21 15:00-17:00 (America/New_York)

Abstract: The synthetic control method, introduced in Abadie and Gardeazabal (2003), has emerged as a popular empirical methodology for estimating causal effects with observational data when the “gold standard” of a randomized control trial is not feasible. In a recent survey on causal inference and program evaluation methods in economics, Athey and Imbens (2015) describe the synthetic control method as “arguably the most important innovation in the evaluation literature in the last fifteen years”. While many of the most prominent applications of the method, as well as its genesis, were initially circumscribed to the policy evaluation literature, synthetic controls have found their way more broadly to social sciences, biological sciences, engineering and even sports. However, only recently have synthetic controls been introduced to the machine learning community, through their natural connection to matrix and tensor estimation in Amjad, Shah and Shen (2017) as well as Amjad, Misra, Shah and Shen (2019).

In this tutorial, we will survey the rich body of literature on methodological aspects, mathematical foundations and empirical case studies of synthetic controls. We will provide guidance for empirical practice, with special emphasis on feasibility and data requirements, and characterize the practical settings where synthetic controls may be useful and those where they may fail. We will describe empirical case studies from policy evaluation, retail, and sports. Moreover, we will discuss mathematical connections of synthetic controls to matrix and tensor estimation, high dimensional regression, and time series analysis. Finally, we will discuss how synthetic controls are likely to be instrumental in the next wave of development in reinforcement learning using observational data.

Bios: Devavrat Shah is a Professor with the department of Electrical Engineering and Computer Science and Director of Statistics and Data Science at the Massachusetts Institute of Technology. His current research interests are at the interface of Statistical Inference and Social Data Processing. His work has been recognized through prize paper awards in Machine Learning, Operations Research and Computer Science, as well as career prizes: the 2008 ACM Sigmetrics Rising Star Award, the 2010 Erlang Prize from the INFORMS Applied Probability Society, and the 2019 ACM Sigmetrics Test of Time Paper Award. He is a distinguished young alumnus of his alma mater, IIT Bombay. He has authored the monographs “Gossip algorithms” and “Explaining the success of nearest neighbors in prediction”. He co-founded the machine learning start-up Celect, Inc., which has been part of Nike, Inc. since August 2019.

Alberto Abadie is an econometrician and empirical microeconomist, with broad disciplinary interests that span economics, political science and statistics. Professor Abadie received his Ph.D. in Economics from MIT in 1999. Upon graduating, he joined the faculty at the Harvard Kennedy School, where he was promoted to full professor in 2005. He returned to MIT in 2016, where he is Professor of Economics and Associate Director of the Institute for Data, Systems, and Society (IDSS). His research areas are econometrics, statistics, causal inference, and program evaluation. Professor Abadie’s methodological research focuses on statistical methods to estimate causal effects and, in particular, the effects of public policies, such as labor market, education, and health policy interventions. He is Associate Editor of Econometrica and AER: Insights, and has previously served as Editor of the Review of Economics and Statistics and Associate Editor of the Journal of Business and Economic Statistics. He is a Fellow of the Econometric Society.

32-G449 (Stata Center - Patil/Kiva Conference Room)
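To make the basic idea behind synthetic controls concrete, the sketch below fits non-negative weights that sum to one over untreated "donor" units so that their weighted combination tracks the treated unit before the intervention, then uses those weights to predict the treated unit's counterfactual afterwards. This is a minimal illustration of the convex-weight formulation, assuming projected gradient descent as the solver; the estimators covered in the tutorial may differ, and all data and function names here are made up.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def fit_weights(donors_pre, treated_pre, iterations=500):
    """Least-squares fit of simplex weights; donors_pre[j][t] is donor j in pre-period t."""
    n = len(donors_pre)
    periods = range(len(treated_pre))
    weights = [1.0 / n] * n
    # Conservative step size: the squared Frobenius norm bounds the curvature.
    step = 1.0 / (2.0 * sum(x * x for row in donors_pre for x in row))
    for _ in range(iterations):
        residual = [
            sum(weights[j] * donors_pre[j][t] for j in range(n)) - treated_pre[t]
            for t in periods
        ]
        gradient = [
            2.0 * sum(residual[t] * donors_pre[j][t] for t in periods)
            for j in range(n)
        ]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, gradient)]
        )
    return weights


def counterfactual(weights, donors_post):
    """Weighted combination of donor outcomes in the post-treatment periods."""
    periods = range(len(donors_post[0]))
    return [sum(w * donor[t] for w, donor in zip(weights, donors_post)) for t in periods]


if __name__ == "__main__":
    # Made-up data: the treated unit equals 0.3 * donor 0 + 0.7 * donor 1 before treatment.
    donors_pre = [[1, 0, 2, 1, 0, 2], [0, 1, 1, 2, 2, 0], [2, 2, 0, 0, 1, 1]]
    treated_pre = [0.3, 0.7, 1.3, 1.7, 1.4, 0.6]
    donors_post = [[1, 2], [2, 1], [0, 1]]
    treated_post = [2.7, 2.3]

    weights = fit_weights(donors_pre, treated_pre)
    synthetic = counterfactual(weights, donors_post)
    effect = [obs - syn for obs, syn in zip(treated_post, synthetic)]
    print("weights:", weights)
    print("estimated effect:", effect)
```

Restricting the weights to the simplex keeps the synthetic unit an interpolation of real donors rather than an extrapolation, which is one reason the method's output is easy to interpret.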

December 05

The Non-Stochastic Control Problem

2019-12-05 16:00-17:00 (America/New_York)

Abstract: Linear dynamical systems are a continuous subclass of reinforcement learning models that are widely used in robotics, finance, engineering, and meteorology. Classical control, since the work of Kalman, has focused on dynamics with Gaussian i.i.d. noise, quadratic loss functions and, in terms of provably efficient algorithms, known state-space realization and observed state. We'll discuss how to apply new machine learning methods which relax all of the above: efficient control with adversarial noise, general loss functions, unknown systems, and partial observation.

Bio: Elad Hazan is a professor of computer science at Princeton University. His research focuses on the design and analysis of algorithms for basic problems in machine learning and optimization. Amongst his contributions are the co-development of the AdaGrad optimization algorithm, and the first sublinear-time algorithms for convex optimization. He is the recipient of the Bell Labs Prize, the IBM Goldberg best paper award (twice, in 2012 and 2008), a European Research Council grant, a Marie Curie fellowship, and the Google Research Award (twice). He served on the steering committee of the Association for Computational Learning and was program chair for COLT 2015. In 2017 he co-founded In8 inc., focusing on efficient optimization and control, which was acquired by Google in 2018. He is the co-founder and director of Google AI Princeton.

32-G449 (Stata Center, Patil/Kiva Conference Room)