Log4j is widely considered one of the most memorable vulnerabilities in recent times. It had a huge impact, not just because of how pervasive it was throughout codebases, but because it was difficult to tell whether your codebase was affected. When our team at Trellix realized that the Python tarfile module, another widely used core library, was still vulnerable to CVE-2007-4559, we wanted to ensure that this didn’t become the next Log4j. As an industry, we must learn from our past mistakes. With this mindset we created Creosote, an open-source vulnerability scanner written to identify CVE-2007-4559. Using Creosote as a starting framework, we began patching over 65,000 open-source repositories on GitHub.
This talk will provide a technical overview of how Creosote functions and how it differs from other source-code analyzers. It will include an in-depth analysis of how abstract syntax trees (ASTs) can be leveraged to perform seamless, syntactically aware code patching, as well as a summary of the issues that arise when trying to patch tens of thousands of repositories automatically. The presentation will conclude by covering the process of automatically forking repositories and creating pull requests at that scale.
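To give a flavor of what syntactically aware analysis can look like, here is a minimal sketch that uses Python's standard `ast` module to locate `.extract()` and `.extractall()` method calls, the tarfile entry points associated with CVE-2007-4559. It is an illustration only and not Creosote's actual implementation: the function name `find_extract_calls` and the sample source are hypothetical, and the match is deliberately naive (it flags any object's `extract`/`extractall` call without confirming the object is a tarfile).

```python
import ast


def find_extract_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, method name) for each .extract()/.extractall() call.

    Hypothetical helper for illustration; it does not verify that the
    receiver is a tarfile.TarFile object.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in {"extract", "extractall"}
        ):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


# Hypothetical input: a snippet that extracts an archive without checking member paths.
sample = '''import tarfile
with tarfile.open("archive.tar") as tar:
    tar.extractall(path="out")
'''

print(find_extract_calls(sample))
```

Working on the parsed tree rather than on raw text means that comments, string literals, and unusual formatting do not produce matches, and each hit carries a line number that a patching step could use to locate the call.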