The Devil is in the GAN: Defending Deep Generative Models Against Adversarial Attacks

Presented at Black Hat USA 2021, Aug. 5, 2021, 10:20 a.m. (40 minutes)

Generative Adversarial Networks (GANs) are an emerging AI technology with vast potential for disrupting science and industry. GANs are able to synthesize data from complex, high-dimensional manifolds, e.g., images, text, music, or molecular structures. Potential applications include media content generation and enhancement, synthesis of drugs and medical prosthetics, or generally boosting the performance of AI through semi-supervised learning.

Training GANs is an extremely compute-intensive task that requires highly specialized expert skills. State-of-the-art GANs have sizes reaching billions of parameters and require weeks of Graphics Processing Unit (GPU) training time. A number of GAN model "zoos" already offer trained GANs for download from the internet, and, as GANs grow in complexity, it can be expected that most users will have to source trained GANs from potentially untrusted third parties.

Surprisingly, while there exists a rich body of literature on evasion and poisoning attacks against conventional, discriminative Machine Learning (ML) models, adversarial threats against GANs, or, more broadly, against Deep Generative Models (DGMs), have not been analyzed before. To close this gap, we will introduce in this talk a formal threat model for training-time attacks against DGMs. We will demonstrate that, with little effort, attackers can backdoor pre-trained DGMs and embed compromising data points which, when triggered, could cause material and/or reputational damage to the organization sourcing the DGM. Our analysis shows that the attacker can bypass naïve detection mechanisms, but that a combination of static and dynamic inspections of the DGM is effective in detecting our attacks.
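The backdoor idea above can be illustrated with a minimal toy sketch (our own simplification, not the talk's actual attack): a "generator" is just a map from latent vectors to data points, and the attacker arranges for it to behave normally everywhere except in a small neighborhood of a secret trigger latent, where it emits an attacker-chosen output. Because random sampling of the latent space almost never lands near the trigger, naïve output inspection sees nothing unusual. All names here (`Z_TRIGGER`, `X_TARGET`, `backdoored_generator`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def benign_generator(z):
    # Stand-in for a trained GAN generator: a fixed linear map from
    # 2-D latent vectors to 2-D "data" points.
    W = np.array([[0.5, -0.2], [0.1, 0.9]])
    return np.atleast_2d(z) @ W.T

Z_TRIGGER = np.array([5.0, -5.0])  # secret trigger latent (attacker's choice)
X_TARGET = np.array([99.0, 99.0])  # compromising output embedded by attacker

def backdoored_generator(z, eps=0.5):
    # Behaves like the benign generator except within eps of the trigger.
    z = np.atleast_2d(z)
    out = benign_generator(z)
    hit = np.linalg.norm(z - Z_TRIGGER, axis=1) < eps
    out[hit] = X_TARGET
    return out

# Naive "dynamic inspection": latents drawn from N(0, I) essentially never
# fall near Z_TRIGGER (which sits ~7 standard deviations out), so the two
# models produce identical outputs on ordinary samples.
z_samples = rng.standard_normal((10_000, 2))
print(np.allclose(benign_generator(z_samples),
                  backdoored_generator(z_samples)))  # True

# An attacker who knows the trigger retrieves the embedded point directly:
print(backdoored_generator(Z_TRIGGER))  # [[99. 99.]]
```

This also hints at why the talk's defense combines static and dynamic inspection: dynamic sampling alone can miss a backdoor hidden in a low-probability region of the latent space, so static analysis of the model itself is needed as a complement.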


Presenters:

  • Mathieu Sinn - Research Staff Member, IBM Research Europe
    Mathieu Sinn is a Senior Technical Staff Member and Manager in the IBM Research Europe laboratory in Dublin, Ireland. He has more than 15 years of experience working on fundamental and applied aspects of AI and Machine Learning. In recent years, Mathieu and his team have focused on AI Security, including work on the robustness of AI against evasion/poisoning attacks and the mitigation of privacy issues, e.g., via differentially private Machine Learning. Mathieu and his team created leading open source projects in these areas (Adversarial Robustness Toolbox, Differential Privacy Library) and are currently working with DARPA under its GARD (Guaranteeing AI Robustness against Deception) program. Mathieu has published more than 40 peer-reviewed papers and filed more than 30 US patents. For his technical achievements at IBM, he has received numerous corporate Technical Accomplishment and Innovation Awards. He is a Data Science Thought Leader certified by The Open Group and regularly serves as a reviewer for top AI conferences, external PhD committees, and government research programs.
  • Ambrish Rawat - Research Staff Member, IBM Research Europe
    Ambrish Rawat is an Advisory Research Staff Member in the IBM Research Europe laboratory in Dublin, Ireland. He holds a Master of Technology from the Indian Institute of Technology, Delhi, and a Master of Philosophy from the University of Cambridge, UK. Ambrish has worked on a vast array of cutting-edge research challenges in AI and Machine Learning, including Probabilistic Modeling, Bayesian Deep Neural Networks, Neural Architecture Search, Adversarial Machine Learning, and Federated Learning. Ambrish has extensively published his work at top AI conferences, filed multiple US and international patents, and is an active contributor to open source software projects.
  • Killian Levacher - Research Staff Member, IBM Research Europe
    Killian Levacher is a research scientist in the IBM Research Europe laboratory in Dublin, Ireland. He is also a principal investigator in the H2020 AI4Media Research Consortium, focusing on various dimensions of Trusted AI such as Adversarial Robustness, Explainability, Privacy, and Fairness. Killian obtained his PhD in Computer Science and Statistics at Trinity College Dublin. He is the author of over 26 top-tier academic research articles as well as 18 patents in the areas of Artificial Intelligence and Blockchain systems, with more than 10 years of experience working on fundamental and applied aspects of AI and Machine Learning. His research currently focuses on investigating novel threat models targeting Generative Adversarial Networks. He is also one of the main contributors to the open source Linux Foundation Adversarial Robustness Toolbox project.
