Challenges in Reliable Machine Learning

Speaker

Kamalika Chaudhuri

University of California, San Diego

Host

Philippe Rigollet

MIT, Department of Mathematics

Abstract:
As machine learning is increasingly used in real applications, there is a need for reliable and robust methods. In this talk, we will discuss two challenges that arise in reliable machine learning. The first is sample selection bias, where training data is available only from a distribution conditioned on a sample selection policy, but the resulting classifier needs to be evaluated on the entire population. We will show how active learning can be used to obtain a small amount of labeled data from the entire population, which can then be used to correct this kind of sample selection bias. The second is robustness to adversarial examples -- slight strategic perturbations of legitimate test inputs that cause misclassification. We look at adversarial examples in the context of a simple non-parametric classifier, the k-nearest neighbor classifier, and examine its robustness properties. We provide bounds on its robustness as a function of k, and propose a more robust 1-nearest neighbor classifier.

Joint work with Songbai Yan, Tara Javidi, Yaoyuan Yang, Cyrus Rashtchian, Yizhen Wang and Somesh Jha.
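
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of the k-nearest neighbor classifier and of what an adversarial perturbation means for it. It is not the method or the bounds presented in the talk. The function names, the toy data, and the choice of Euclidean distance are all assumptions made for illustration. The "safe radius" shown is only the elementary triangle-inequality bound for 1-nearest neighbor: if the nearest same-label training point is at distance d_same and the nearest other-label point is at distance d_other, then no perturbation smaller than (d_other - d_same) / 2 can change the prediction.

```python
import math
from collections import Counter


def knn_predict(train, x, k=1):
    """Majority vote among the k training points closest to x (Euclidean).

    `train` is a list of (point, label) pairs; ties in the vote are broken
    in favor of the label seen first among the nearest neighbors.
    """
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def one_nn_safe_radius(train, x):
    """Lower bound on the perturbation size needed to flip the 1-NN label of x.

    Follows from the triangle inequality; assumes `train` contains at least
    two distinct labels. Illustrative only, not a result from the talk.
    """
    pred = knn_predict(train, x, k=1)
    d_same = min(math.dist(p, x) for p, y in train if y == pred)
    d_other = min(math.dist(p, x) for p, y in train if y != pred)
    return (d_other - d_same) / 2


# Hypothetical toy data: two classes on a line.
train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"),
         ((3.0, 0.0), "b"), ((4.0, 0.0), "b")]
x = (1.4, 0.0)

print(knn_predict(train, x, k=1))            # label of the clean input
print(one_nn_safe_radius(train, x))          # perturbations below this are safe
print(knn_predict(train, (1.9, 0.0), k=1))   # shifted by 0.5: inside the radius
print(knn_predict(train, (2.1, 0.0), k=1))   # shifted by 0.7: label flips
```

In this toy example the test point sits about 0.6 from the decision boundary, so a shift of 0.5 leaves the label unchanged while a shift of 0.7 flips it. In higher dimensions the same quantity is only a lower bound on the true robustness radius.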

Bio:
Kamalika Chaudhuri received a Bachelor of Technology from the Indian Institute of Technology, Kanpur, and a PhD in Computer Science from the University of California, Berkeley. She is currently an Associate Professor at the University of California, San Diego. She is a recipient of the NSF CAREER Award, the Hellman Faculty Fellowship, and Google and Bloomberg Faculty Awards.

Kamalika's research is on the foundations of trustworthy machine learning, which includes problems such as learning from sensitive data while preserving privacy, learning under sampling bias, and learning in the presence of an adversary. She is also broadly interested in a number of topics in learning theory, such as non-parametric methods, online learning, and active learning.

Date

October 17, 2019, 4:15 PM to 5:15 PM (America/New_York)

Location

32-G449 (Stata Center, Patil/Kiva Conference Room)

Organizer & Contact

Marcia G. Davidson

marcia@csail.mit.edu

617-253-3049

Part of

Machine Learning Seminar Series 2019
