Hermes Attack: Steal DNN Models In AI Privatization Deployment Scenarios

Presented at Black Hat Europe 2020 Virtual, Dec. 9, 2020, 2:20 p.m. (40 minutes).

AI privatization deployment is becoming a big market in China and the US. For example, company A owns a private, high-quality DNN model for live-face authentication and would like to sell this model to other companies for a license fee, e.g., a million dollars per year. In this privatization deployment scenario, company A (the model owner) allows company B to use the DNN model and is motivated to protect its confidentiality, while company B has physical access to its own machines and is motivated to steal the DNN model to save the license fees in the coming years. Company A usually protects its model with existing software-hardening and model-protection techniques on the host side, e.g., secure boot, full-disk encryption, runtime access control, and root-privilege restrictions. Fortunately for the owner, none of the existing model extraction attacks is able to reconstruct the whole DNN model. Thus, at this point, people still have the illusion that the model is safe (or at least that the leakage is acceptable).

However, in this talk, we identify the PCIe bus connecting the host and the GPU/AI accelerator as a new attack surface, one that allows an adversary to FULLY reconstruct the WHOLE DNN model. We believe this attack technique also works in other, similar scenarios, such as a smartphone with an NPU. The attack has three main steps:

1. intercept the PCIe traffic;
2. reverse-engineer the traffic to recover high-level semantics; and
3. fully reconstruct the DNN model.

The second, reverse-engineering step is very challenging due to the closed-source runtime, driver, and GPU instruction set, as well as millions of noise records caused by out-of-order traffic and unrelated control traffic; a minimal sketch of the filtering idea appears after this abstract.

In the presentation, we will detail the attack steps and the algorithm for reconstructing the DNN model (a simplified rebuild sketch also follows below). We will also show three demos using three real-world GPUs, i.e., the NVIDIA GeForce GT 730, NVIDIA GeForce GTX 1080 Ti, and NVIDIA GeForce RTX 2080 Ti, and three DNNs (i.e., MNIST, VGG, and ResNet). We believe our attack technique works on all existing GPUs and AI accelerators.

Finally, we will discuss potential countermeasures to mitigate such attacks. We hope that through our work, people will rethink the security of AI privatization deployment and harden AI systems at both the software and hardware levels.
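To make the second step concrete, here is a minimal Python sketch of the traffic-filtering idea: keep only the memory-write TLPs that land in the GPU's DMA window, drop doorbell/control traffic, and reassemble the payloads in address order to undo out-of-order delivery. The record format (kind, addr, tag, payload) and the fixed DMA window are illustrative assumptions on our part, not the actual interposer output used in the talk.

    # Hypothetical capture format: one record per PCIe TLP.
    from dataclasses import dataclass

    @dataclass
    class Tlp:
        kind: str      # "MWr" (memory write), "MRd" (memory read), "CplD" (completion)
        addr: int      # target device address
        tag: int       # transaction tag, pairs requests with completions
        payload: bytes # raw data carried by the TLP

    def extract_dma_payloads(trace, dma_lo, dma_hi):
        """Filter a TLP trace down to the DMA data stream.

        Keeps only memory writes inside the assumed DMA window
        [dma_lo, dma_hi) and reassembles their payloads in address
        order, since PCIe delivery order need not match address order.
        """
        data_tlps = [t for t in trace
                     if t.kind == "MWr" and dma_lo <= t.addr < dma_hi]
        data_tlps.sort(key=lambda t: t.addr)  # address order, not arrival order
        blob = bytearray()
        for t in data_tlps:
            blob += t.payload
        return bytes(blob)

Host-to-GPU reads could be recovered the same way by pairing MRd requests with their CplD completions via the tag field.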

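A similarly hedged sketch of the third step: once reverse engineering yields per-layer hyper-parameters and weight blobs, a functionally equivalent model can be rebuilt offline in an ordinary framework. The layer-descriptor dictionaries below are a hypothetical intermediate format, and PyTorch merely stands in for whatever framework the victim model uses.

    import torch
    import torch.nn as nn

    def rebuild_model(layer_descs):
        """Rebuild a sequential DNN from recovered layer descriptors.

        Each descriptor is a dict such as
          {"type": "conv", "in": 3, "out": 64, "k": 3,
           "weights": <numpy array>, "bias": <numpy array>}
        with parameters recovered from the captured PCIe traffic.
        """
        layers = []
        for d in layer_descs:
            if d["type"] == "conv":
                m = nn.Conv2d(d["in"], d["out"], d["k"], padding=d["k"] // 2)
            elif d["type"] == "fc":
                m = nn.Linear(d["in"], d["out"])
            elif d["type"] == "relu":
                layers.append(nn.ReLU())
                continue
            else:
                raise ValueError("unknown layer type: " + d["type"])
            # Load the stolen parameters into the freshly built layer.
            m.weight.data = torch.from_numpy(d["weights"]).float()
            m.bias.data = torch.from_numpy(d["bias"]).float()
            layers.append(m)
        return nn.Sequential(*layers)

If the recovered shapes and weight blobs are consistent, the rebuilt model carries exactly the stolen parameters, which is what makes this a full white-box extraction rather than an approximation.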
Presenters:

  • Yueqiang Cheng - Senior Staff Security Scientist, Baidu Security
    Yueqiang Cheng is a Senior Staff Security Scientist at Baidu USA X-Lab. His research interests focus on system security (e.g., SGX, virtualization), CPU side channels, Rowhammer, blockchain security, and autonomous driving security.
  • Yuankun Zhu - PhD Student, UT Dallas
Yuankun Zhu is a second-year PhD student in computer science at UT Dallas. Before joining UT Dallas, he earned his bachelor's degree in computer science from Hunan University. His research interests focus on optimizing GPU application performance through GPU memory management on integrated CPU-GPU architectures, as well as reconstructing DNN models from PCIe traffic while using a discrete NVIDIA GPU as an accelerator.
  • Husheng Zhou - Member of Technical Staff, VMware
Husheng Zhou is a Member of Technical Staff in the Cloud Platform Business Unit of VMware. He received his PhD from the University of Texas at Dallas in December 2018. He is experienced in distributed shared memory, high-performance computing, deep learning, and real-time systems.
