In Need of 'Pair' Review: Vulnerable Code Contributions by GitHub Copilot

Presented at Black Hat USA 2022, Aug. 10, 2022, 1:30 p.m. (40 minutes).

On June 29, 2021, GitHub announced and released its newest tool, 'Copilot' - an 'AI-based Pair Programmer' built on a deep learning model trained over vast quantities of open-source GitHub code. However, we humans wrote most of that code, and much of it isn't great. It has bugs, it contains dated coding practices, and many repositories even contain dangerously insecure code. Given the vast quantity of garbage code that Copilot has learned from, is it reasonable to trust the code suggestions that it generates?

In this talk, we demonstrate that GitHub Copilot is susceptible to writing vulnerabilities along multiple axes, from SQL injections to buffer overflows, and from use-after-free to cryptographic issues. We try different languages - C, Python, and even Verilog, where we show it also generates hardware bugs (when it can generate hardware at all).

Overall, we tried 89 different scenarios for Copilot, generating 1,689 suggestions, and found approximately 40% to be vulnerable.
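
To make one of the vulnerability classes named in the abstract concrete, here is a minimal Python sketch of a SQL injection and its usual fix. It is an illustration only and is not taken from the talk's scenarios or from actual Copilot output; the `users` table and the helper names `make_demo_db`, `find_user_unsafe`, and `find_user_safe` are hypothetical, and the example uses the standard-library `sqlite3` module.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted into the SQL text, so a quote
    # character in `name` can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: a parameterized query keeps the input as data, never as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "nobody' OR '1'='1"
    # The injected OR clause makes the unsafe query match every row.
    print("unsafe:", find_user_unsafe(conn, payload))
    # The parameterized query looks for a user literally named `payload`.
    print("safe:  ", find_user_safe(conn, payload))
```

The two functions differ only in how the input reaches the query. A completion in the style of `find_user_unsafe` looks plausible at a glance, which is why suggestions like it call for review.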

Presenters:

  • Hammond Pearce - Research Assistant Professor, New York University
    Dr. Hammond Pearce is a Research Assistant Professor at New York University Tandon School of Engineering in the Department of Electrical and Computer Engineering, affiliated with the NYU Center for Cybersecurity. He received the B.E. (Hons) and PhD degrees in Computer Systems Engineering, both from the University of Auckland, Auckland, New Zealand. His primary research interests are the cybersecurity of machine learning applications and industrial and embedded cybersecurity. In 2019 he took part in the NASA International Internship Program and worked at NASA Ames in California.
  • Benjamin Tan - Assistant Professor, University of Calgary
    Dr. Benjamin Tan is currently an Assistant Professor at the University of Calgary, in Calgary, Canada. His current research involves hardware security and the robustness of deep learning systems. He previously worked as a Research Assistant Professor and Postdoctoral Associate at the New York University Center for Cybersecurity. Before moving to North America, he was a Professional Teaching Fellow at the University of Auckland. He was awarded a PhD degree by the University of Auckland for work focused on security in heterogeneous multiprocessor systems. He has research interests in embedded systems and computer architecture in general, and has also worked on IC supply chain security, specifically protection against IP reverse-engineering.
  • Brendan Dolan-Gavitt - Assistant Professor, New York University
    Brendan Dolan-Gavitt is an Assistant Professor at NYU.
  • Baleegh Ahmad - PhD Candidate, New York University
    Baleegh Ahmad is a PhD Candidate at New York University.
