Planting Statistically Undetectable Backdoors in Deep Neural Networks
Speaker
Neekon Vafa
Massachusetts Institute of Technology
Host
Henry Corrigan-Gibbs
Massachusetts Institute of Technology
Abstract:
In this talk, I will show how to plant backdoors in a large class of deep neural networks. These backdoors are statistically undetectable in the white-box setting, meaning that the backdoored and honestly trained models are close in total variation distance, even given the full descriptions of the models (e.g., all of the weights). The backdoor provides access to (invariance-based) adversarial examples for every input. However, without the backdoor, no one can generate any such adversarial examples, assuming the worst-case hardness of shortest vector problems on lattices. This talk is based on upcoming work with Andrej Bogdanov and Alon Rosen.
Zoom info:
Meeting ID: 945 5603 5878
Password: 865039