November 29

Machine Learning for Healthcare Data
November 29, 2017, 4:00-5:00 pm, 32-G882 (Stata Center - Hewlett Room)

Abstract: We will discuss multiple ways in which healthcare data is acquired and how machine learning methods are currently being introduced into clinical settings. Topics will include: 1) modeling disease trends and making other predictions, including joint predictions of multiple conditions, from electronic health record (EHR) data using Gaussian processes; 2) predicting surgical complications, and transfer learning methods for combining databases; 3) using mobile apps and integrated sensors to improve the granularity of recorded health data for chronic conditions; and 4) combining mobile app and social network information to predict the spread of contagious disease. Current work in these areas will be presented, and the future of machine learning contributions to the field will be discussed.

Bio: Katherine Heller is an Assistant Professor in Statistical Science at Duke University. She is a recent recipient of a Google faculty research award, a first-round BRAIN Initiative award from the NSF, and a CAREER award. She received her PhD from the Gatsby Computational Neuroscience Unit at UCL, and was a postdoc at the University of Cambridge on an EPSRC fellowship and at MIT on an NSF fellowship.
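The Gaussian-process models mentioned in the abstract are not spelled out here; purely as a sketch of the underlying machinery, the following numpy example computes a GP regression posterior over a toy "measurement trend" (the RBF kernel, noise level, and all names are illustrative assumptions, not the speaker's actual models).

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    """Posterior mean and variance of GP regression with an RBF kernel."""
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.diag(cov)

# Toy trend: noisy observations of a smooth signal at four time points.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.sin(x)
mean, var = gp_posterior(x, y, np.array([1.5]))
```

The posterior variance is what makes GPs attractive for clinical predictions: the model reports how uncertain it is between observations, not just a point estimate.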

November 15

From Reading Comprehension to Open-Domain Question Answering
November 15, 2017, 4:00-5:00 pm, 32-G449 (Stata Center - Patil / Kiva Conference Room)

Abstract: Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of NLP. This task of reading comprehension (i.e., question answering over a passage of text) has seen a resurgence of interest, due to the creation of large-scale datasets and well-designed neural network models.

I will talk about how we build simple yet effective models for advancing a machine's ability at reading comprehension. I'll focus on explaining the logical structure behind these neural architectures and discussing the capacities of these models as well as their limits. Next, I'll talk about how we combine state-of-the-art reading comprehension systems with traditional IR modules to build a new generation of open-domain question answering systems. Our system is much simpler than traditional QA systems, answers questions efficiently over the full English Wikipedia, and shows great promise on multiple QA benchmarks. I'll conclude with the main challenges and directions for future research.

Bio: Danqi Chen is a Ph.D. candidate in Computer Science at Stanford University, advised by Christopher Manning. She works on deep learning for natural language processing, and is particularly interested in the intersection between text understanding and knowledge representation/reasoning. Her research spans machine comprehension/question answering, knowledge base construction, and syntactic parsing, with an emphasis on building principled yet highly effective models. She is a recipient of a Facebook Fellowship, a Microsoft Research Women's Fellowship, and outstanding paper awards at ACL 2016 and EMNLP 2017. Previously, she received her B.S. with honors from Tsinghua University in 2012.
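The retriever half of the retrieve-then-read pipeline described in the abstract can be caricatured in a few lines of TF-IDF scoring. The toy corpus and scoring details below are invented for illustration (they are not taken from the actual system): the ranker picks the passage that a reading-comprehension model would then extract an answer from.

```python
import math
from collections import Counter

def tfidf_retrieve(query, docs):
    """Rank documents by a simple TF-IDF overlap score with the query."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1

    def score(toks):
        tf = Counter(toks)
        return sum(tf[t] * math.log((n + 1) / (df[t] + 1))
                   for t in query.lower().split() if t in tf)

    return sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)

docs = [
    "the eiffel tower is in paris and was completed in 1889",
    "deep learning models need large datasets",
    "paris is the capital of france",
]
order = tfidf_retrieve("when was the eiffel tower completed", docs)
```

Terms that appear in every document (like "the") get an IDF of zero, so the ranking is driven by the rare, informative query words.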

November 08

Black Box Variational Inference: Scalable, Generic Bayesian Computation and its Applications
November 8, 2017, 4:00-5:00 pm, 32-G575 (Stata Center - 5th Floor)

Abstract: Probabilistic generative models posit hidden structure to describe data; they are robust to noise, uncover unseen patterns, and make predictions about the future. They have addressed problems in neuroscience, astrophysics, genetics, and medicine. The main computational challenge is computing the hidden structure given the data: posterior inference. For most models of interest, computing the posterior distribution requires approximations such as variational inference. Classically, variational inference was feasible to deploy in only a small fraction of models. We develop black box variational inference, a variational inference algorithm that is easy to deploy on a broad class of models and has already found use in neuroscience and healthcare. The ideas behind it also facilitate new kinds of variational methods, such as hierarchical variational models, which improve the approximation quality of variational inference by building higher-fidelity approximations from coarser ones. Black box variational inference opens the door to new models and better posterior approximations. Lastly, I will discuss some of the challenges that variational methods face moving forward.

Bio: Rajesh Ranganath is a postdoc at Columbia University's Department of Statistics and a research affiliate at MIT's Institute for Medical Engineering and Science. He will be an assistant professor at the Courant Institute of Mathematical Sciences at NYU starting January 2018. His research interests include approximate inference, model checking, Bayesian nonparametrics, and machine learning for healthcare. Rajesh recently completed his PhD at Princeton with David Blei. Before starting his PhD, Rajesh worked as a software engineer for AMA Capital Management. He obtained his BS and MS from Stanford University with Andrew Ng and Dan Jurafsky. Rajesh has won several awards and fellowships, including the NDSEG graduate fellowship and the Porter Ogden Jacobus Fellowship, given to the top four doctoral students at Princeton University.
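The abstract gives no formulas, but the core idea of black box variational inference is that the ELBO gradient can be estimated with only samples from q and evaluations of log p, via the score-function estimator grad = E_q[(log p - log q) * grad_log_q]. The sketch below fits a 1-D Gaussian q to a Gaussian target; the target, step sizes, and variable names are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_p(z):
    """Unnormalized log density of the target posterior: N(2, 1)."""
    return -0.5 * (z - 2.0) ** 2

def log_q(z, mu, log_sigma):
    sigma = np.exp(log_sigma)
    return -0.5 * ((z - mu) / sigma) ** 2 - log_sigma

def grad_log_q(z, mu, log_sigma):
    """Score of q with respect to (mu, log_sigma)."""
    sigma = np.exp(log_sigma)
    d_mu = (z - mu) / sigma**2
    d_ls = ((z - mu) / sigma) ** 2 - 1.0
    return np.stack([d_mu, d_ls])

mu, log_sigma = 0.0, 0.0
for step in range(2000):
    z = mu + np.exp(log_sigma) * rng.standard_normal(64)  # sample from q
    weight = log_p(z) - log_q(z, mu, log_sigma)           # ELBO integrand
    grad = (grad_log_q(z, mu, log_sigma) * weight).mean(axis=1)
    mu += 0.05 * grad[0]                                  # gradient ascent
    log_sigma += 0.05 * grad[1]
```

Nothing here depends on the particular form of log p, which is what makes the method "black box": swapping in a different model changes only the log_p function.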

October 25

Geometric Deep Learning: Going Beyond Euclidean Data
October 25, 2017, 4:00-5:00 pm, 32-G882 (Stata Center - Hewlett Room)

Abstract: In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in fields from computer vision to speech recognition. So far, research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean-structured data, such as graphs and manifolds. Such geometric data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, recommendation systems, and web applications. The adoption of deep learning in these fields lagged until recently, primarily because the non-Euclidean nature of the objects makes the very definition of basic deep-network operations, such as convolution, rather elusive. In this talk, I will introduce the emerging field of geometric deep learning on graphs and manifolds, and overview existing solutions and applications as well as key difficulties and future research directions. (Based on M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, "Geometric deep learning: going beyond Euclidean data", IEEE Signal Processing Magazine 34(4):18-42, 2017.)

Bio: Michael Bronstein (PhD with distinction 2007, Technion, Israel) is a professor at USI Lugano, Switzerland and Tel Aviv University, Israel. He also serves as a Principal Engineer at Intel Perceptual Computing. During 2017-2018 he is a fellow at the Radcliffe Institute for Advanced Study at Harvard University. Michael's main research interest is in theoretical and computational methods for geometric data analysis. He has authored over 150 papers, the book Numerical Geometry of Non-Rigid Shapes (Springer, 2008), and over 20 granted patents. He has been awarded three ERC grants, a Google Faculty Research Award (2016), and a Rudolf Diesel fellowship (2017) at TU Munich. He was invited as a Young Scientist to the World Economic Forum, an honor bestowed on forty of the world's leading scientists under the age of forty. Michael is a Senior Member of the IEEE, an alumnus of the Technion Excellence Program and the Academy of Achievement, an ACM Distinguished Speaker, and a member of the Young Academy of Europe. In addition to academic work, Michael is actively involved in commercial technology development and consulting to start-up companies. He was a co-founder and technology executive at Novafora (2005-2009), developing large-scale video analysis methods, and one of the chief technologists at Invision (2009-2012), developing low-cost 3D sensors. Following the multi-million-dollar acquisition of Invision by Intel in 2012, Michael has been one of the key developers of the Intel RealSense technology.
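The abstract stays high-level; one concrete and widely used operation from the graph side of this field (drawn from the related literature, not necessarily from this talk) is the normalized graph-convolution layer H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), which replaces the sliding-window convolution that has no direct analogue on a graph. A numpy sketch with made-up sizes:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: symmetric normalization, then ReLU.

    adj:    (n, n) adjacency matrix (0/1, no self-loops)
    feats:  (n, d_in) node features
    weight: (d_in, d_out) learnable projection
    """
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt   # D^-1/2 (A + I) D^-1/2
    return np.maximum(norm @ feats @ weight, 0.0)

# Tiny triangle graph, 2-D node features, projected to 3 output channels.
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
rng = np.random.default_rng(1)
weight = rng.standard_normal((2, 3))
out = gcn_layer(adj, feats, weight)
```

Each node's new representation is a degree-normalized average over its neighborhood followed by a shared linear map, which is exactly the "local, weight-sharing" property that makes it a convolution analogue.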

October 18

Failures of Gradient-Based Deep Learning
October 18, 2017, 4:00-5:00 pm, 32-G882 (Stata Center - Hewlett Room)

Abstract: In recent years, deep learning has become the go-to solution for a broad range of applications, with a long list of success stories. However, it is important, for both theoreticians and practitioners, to also understand the associated difficulties and limitations. In this talk, I'll describe several simple problems for which commonly used deep learning approaches either fail or suffer from significant difficulties, even if one is willing to make strong distributional assumptions. We illustrate these difficulties empirically, and provide theoretical insights explaining their source and (sometimes) how they can be remedied. Includes joint work with Shai Shalev-Shwartz and Shaked Shammah.

Bio: Ohad Shamir is a faculty member in the Department of Computer Science and Applied Mathematics at the Weizmann Institute of Science, Israel. He received a PhD in computer science from the Hebrew University in 2010, advised by Prof. Naftali Tishby. Between 2010 and 2013 he was a postdoctoral and associate researcher at Microsoft Research. His research focuses on machine learning, with an emphasis on algorithms that combine practical efficiency and theoretical insight. He is also interested in the many intersections of machine learning with related fields, such as optimization, statistics, theoretical computer science, and AI.
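The abstract does not name its examples, but one failure mode from this line of work is that for parity-like targets the gradient carries almost no information about which target is being learned. The toy experiment below (the tanh predictor, all sizes, and the comparison are my choices, not the paper's experiments) computes the exact gradient of the squared loss for many random parity targets on 8 bits and checks that the target-dependent variation is tiny compared to the gradient's overall scale, so gradient descent cannot tell the targets apart.

```python
import itertools
import numpy as np

d = 8
rng = np.random.default_rng(0)
X = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))  # all 2^d inputs
w = 0.01 * rng.standard_normal(d)    # parameters of a simple predictor

def parity(X, subset):
    """Target function: product of the chosen coordinates (+/-1)."""
    return np.prod(X[:, subset], axis=1)

def grad_sq_loss(X, y, w):
    """Exact gradient of mean squared loss of tanh(w.x) against targets y."""
    p = np.tanh(X @ w)
    return (2.0 * (p - y) * (1.0 - p**2)) @ X / len(X)

# Gradients for many random size-3 parity targets.
grads = np.array([
    grad_sq_loss(X, parity(X, rng.choice(d, size=3, replace=False)), w)
    for _ in range(30)
])
signal = grads.std(axis=0).max()          # target-dependent variation
scale = np.abs(grads.mean(axis=0)).max()  # target-independent component
```

The variation across targets (signal) comes out far below the common component (scale): the gradients for different parities are nearly identical, which is the informal content of the "gradients are uninformative" results.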

October 11

Tutorial on Deep Learning with Apache MXNet Gluon
October 11, 2017, 2:00-5:00 pm, 54-100

Abstract: This tutorial introduces Gluon, a flexible new interface that pairs MXNet's speed with a user-friendly frontend. Symbolic frameworks like Theano and TensorFlow offer speed and memory efficiency but are harder to program. Imperative frameworks like Chainer and PyTorch are easy to debug, but they can seldom compete with symbolic code when it comes to speed. Gluon reconciles the two, removing a crucial pain point by using just-in-time compilation and an efficient runtime engine.

In this crash course, we'll cover deep learning basics, the fundamentals of Gluon, advanced models, and multiple-GPU deployments. We will walk you through MXNet's NDArray data structure and automatic differentiation tools. We'll show you how to define neural networks both at the atomic level and through Gluon's predefined layers. We'll demonstrate how to serialize models and build dynamic graphs. Finally, we will show you how to hybridize your networks, simultaneously enjoying the benefits of imperative and symbolic deep learning.

Pre-setup:
1) Preferably, have Python 3 installed (Python 2.7 will still work).
2) Install mxnet, either building from source or using: pip install mxnet --pre
   Detailed instructions: https://mxnet.incubator.apache.org/get_started/install.html
3) Install Jupyter: http://jupyter.readthedocs.io/en/latest/install.html
4) Clone a copy of the tutorials: https://github.com/zackchase/mxnet-the-straight-dope

About the Speaker: Alex Smola is a well-known figure in machine learning, and recently joined Amazon AWS as their director of Machine Learning and Deep Learning.
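Gluon's actual hybridization machinery lives inside mxnet; purely as a toy illustration of the "run imperatively once, then replay a cached graph" idea (no mxnet required, and none of these names are Gluon's API), consider:

```python
class ToyHybridBlock:
    """Imperative on the first call; afterwards replays a recorded op list.

    A cartoon of just-in-time graph capture: real frameworks record a
    symbolic graph and optimize it, not a Python list of closures.
    """
    def __init__(self):
        self._trace = None

    def forward(self, x, record):
        # Two "layers": double, then add one. record() wraps each op so
        # the first run captures it for later replay.
        x = record(lambda v: v * 2)(x)
        x = record(lambda v: v + 1)(x)
        return x

    def __call__(self, x):
        if self._trace is not None:      # fast path: replay cached ops
            for op in self._trace:
                x = op(x)
            return x
        ops = []
        def record(op):
            ops.append(op)
            return op
        out = self.forward(x, record)    # slow path: run and capture
        self._trace = ops
        return out

net = ToyHybridBlock()
first = net(3)    # imperative run, builds the trace
second = net(10)  # served from the cached trace
```

The appeal is that you debug the imperative path with ordinary Python tools, then flip one switch to get the cached/compiled path for speed: that is the trade-off the tutorial's "hybridize" step resolves.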

September 20

Sample-Efficient Reinforcement Learning with Rich Observations
September 20, 2017, 4:00-5:00 pm, 32-G882 (Stata Center - Hewlett Room)

Abstract: This talk considers a core question in reinforcement learning (RL): how can we tractably solve sequential decision-making problems where the learning agent receives rich observations? We begin with a new model called Contextual Decision Processes (CDPs) for studying such problems, and show that it encompasses several prior setups for studying RL, such as MDPs and POMDPs. Several special cases of CDPs are, however, known to be provably intractable in their sample complexities. To overcome this challenge, we propose a structural property of such processes, called the Bellman Rank. We find that the Bellman Rank of a CDP (and an associated class of functions) provides an intuitive measure of the hardness of a problem in terms of sample complexity, and is small in several practical settings. In particular, we propose an algorithm whose sample complexity scales with the Bellman Rank of the process and is completely independent of the size of the agent's observation space. We also show that our techniques are robust to our modeling assumptions, make connections to several known results, and highlight novel consequences of our results. This talk is based on joint work with Nan Jiang, Akshay Krishnamurthy, John Langford, and Rob Schapire.

Bio: Alekh Agarwal is a researcher in the New York lab of Microsoft Research, prior to which he obtained his PhD from UC Berkeley. Alekh's research currently focuses on topics in interactive machine learning, including contextual bandits, reinforcement learning, and online learning. Previously, he has worked on several topics in optimization, including stochastic and distributed optimization. He has won several awards for his research, including the NIPS 2015 best paper award.
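Informally, the Bellman Rank is the rank of a matrix of average Bellman errors indexed by (roll-in policy, candidate value function). As a hedged numerical cartoon only (the factors below are random, not derived from any actual decision process), low Bellman Rank just means that this matrix factorizes through a small latent dimension, even when there are many policies and candidate functions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_policies, n_functions, bellman_rank = 50, 40, 3

# Pretend each policy and each candidate value function embeds into a
# small latent space; the Bellman-error matrix is then their product.
policy_emb = rng.standard_normal((n_policies, bellman_rank))
func_emb = rng.standard_normal((n_functions, bellman_rank))
errors = policy_emb @ func_emb.T  # errors[i, j]: avg Bellman error of f_j under pi_i

rank = np.linalg.matrix_rank(errors)
```

The sample-complexity claim in the abstract is that learning can scale with this small rank (3 here) rather than with the 50-by-40 extent of the matrix, let alone the size of the raw observation space.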

September 18

Libratus: Beating Top Humans in No-Limit Poker
September 18, 2017, 3:00-4:00 pm, Seminar Room G449 (Patil/Kiva)

Abstract: Poker has been a challenge problem in AI and game theory for decades. As a game of imperfect information, poker involves obstacles not present in games like chess or Go. No program had been able to beat top professionals in large poker games until now: in January 2017, our AI Libratus decisively defeated a team of the top professional players in heads-up no-limit Texas Hold'em. Libratus features a number of innovations that form a new approach to AI for imperfect-information games. The algorithms are domain-independent and can be applied to a variety of strategic interactions involving hidden information. This is joint work with Tuomas Sandholm.

Bio: Noam Brown is a PhD student in computer science at Carnegie Mellon University, advised by Professor Tuomas Sandholm. His research combines reinforcement learning and game theory to develop AIs capable of strategic reasoning in imperfect-information interactions. He has applied this research to creating Libratus, the first AI to defeat top humans in no-limit Texas Hold'em. His current research focuses on expanding the applicability of the technology behind Libratus to other domains.
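The abstract does not describe Libratus's algorithms, but a standard building block in this line of imperfect-information game solving (used inside counterfactual-regret-style methods; the specifics below are my illustration, not Libratus) is regret matching: play each action in proportion to how much you regret not having played it. Two regret-matching players self-playing rock-paper-scissors have average strategies that approach the uniform equilibrium.

```python
import numpy as np

# Row player's payoff for rock-paper-scissors (a zero-sum game).
PAYOFF = np.array([[ 0., -1.,  1.],
                   [ 1.,  0., -1.],
                   [-1.,  1.,  0.]])

def regret_to_strategy(regret):
    """Play each action in proportion to its positive regret."""
    pos = np.maximum(regret, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(3, 1.0 / 3.0)

def selfplay(iters=20000):
    # Slightly asymmetric start so the dynamics are nontrivial.
    regret = [np.array([1.0, 0.0, 0.0]), np.zeros(3)]
    strategy_sum = [np.zeros(3), np.zeros(3)]
    for _ in range(iters):
        s = [regret_to_strategy(r) for r in regret]
        utils = [PAYOFF @ s[1], -PAYOFF.T @ s[0]]  # expected payoff per action
        for i in range(2):
            strategy_sum[i] += s[i]
            regret[i] += utils[i] - s[i] @ utils[i]
    return [ss / iters for ss in strategy_sum]

avg_row, avg_col = selfplay()
```

The per-iteration strategies can cycle indefinitely; it is the time-averaged strategies that converge toward equilibrium, which is why regret-based solvers report averages rather than final iterates.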