Certified Defenses against Adversarial Examples

Speaker

Aditi Raghunathan

Stanford

Host

Akshay Degwekar, Pritish Kamath and Govind Ramnarayan

MIT CSAIL

Abstract: While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but they are often followed by new, stronger attacks that defeat them.

Can we somehow end this arms race? In this talk, I will present some methods based on convex relaxations (with a focus on semidefinite programming) that output a certificate that, for a given network and test input, no attack can force the error to exceed a certain value. I will then discuss how these certification procedures can be incorporated into neural network training to obtain provably robust networks. Finally, I will present some empirical results on the performance of attacks and different certificates on networks trained using different objectives.

Joint work with Jacob Steinhardt and Percy Liang.
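To make the idea of a certificate concrete, here is a minimal sketch of one very simple relaxation, interval bound propagation, applied to a two-layer ReLU network. This is not the semidefinite programming approach the talk focuses on and is generally much looser; it only illustrates what "no attack can do better than this bound" means. The function names, the tiny network, and the choice of an l-infinity perturbation of radius `eps` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def affine_bounds(
    W: Matrix, b: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Propagate elementwise bounds lower <= x <= upper through x -> W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            # A non-negative weight is minimized at l, a negative one at u.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def certified_margin(
    W1: Matrix, b1: Sequence[float], W2: Matrix, b2: Sequence[float],
    x: Sequence[float], label: int, eps: float,
) -> float:
    """Lower bound on min over j != label of (logit[label] - logit[j]),
    valid for every input within l-infinity distance eps of x."""
    lo = [xi - eps for xi in x]
    hi = [xi + eps for xi in x]
    lo, hi = affine_bounds(W1, b1, lo, hi)
    # ReLU is monotone, so it can be applied to the bounds directly.
    lo = [max(0.0, v) for v in lo]
    hi = [max(0.0, v) for v in hi]
    worst = float("inf")
    for j in range(len(W2)):
        if j == label:
            continue
        # The margin is linear in the hidden layer, so bound it in one step.
        diff_row = [a - c for a, c in zip(W2[label], W2[j])]
        diff_bias = b2[label] - b2[j]
        margin_lo, _ = affine_bounds([diff_row], [diff_bias], lo, hi)
        worst = min(worst, margin_lo[0])
    return worst


# Hypothetical toy network: 2 inputs, 2 hidden units, 2 classes.
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
x, label = [2.0, 0.0], 0

for eps in (0.1, 1.0):
    m = certified_margin(W1, b1, W2, b2, x, label, eps)
    status = "certified robust" if m > 0 else "no certificate"
    print(f"eps={eps}: margin lower bound {m:.3f} ({status})")
```

A positive lower bound proves that no perturbation within the radius can change the prediction. A non-positive bound proves nothing either way: the network may still be robust and the relaxation merely too loose, which is the gap that tighter relaxations aim to close.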

Date

October 03 2018, 16:00 to 17:00 (America/New_York)

Location

32-G575

Organizer & Contact

Rebecca Yadegar

ryadegar@csail.mit.edu

Part of

Algorithms & Complexity Seminars 2018-2019

Related Events

November 09

Simple and Efficient Algorithm for Parallel Matchings

June 05

The Sample Complexity of Toeplitz Covariance Estimation