The Devil's in the Dependency: Data-Driven Software Composition Analysis

Presented at Black Hat USA 2020 Virtual, Aug. 5, 2020, 1:30 p.m. (40 minutes)

<p>We all know that lurking within even the most popular open source packages are flaws that can leave carefully constructed applications vulnerable. In fact, 71% of all applications contain flawed open source libraries, many (70.7%) coming from downstream dependencies which might escape the notice of developers. Using graph analytics and a broad data science toolkit, we untangle the web of open source dependencies and flaws and show the best way for developers to navigate this seemingly intractable game of whack-a-mole.</p><p>In this analysis, we examine over 85,000 applications and their use of more than 500,000 open source libraries. We provide an overview of open source usage showing that typical applications have hundreds or thousands of libraries, with most coming from a cascade of transitive dependencies. We find that proof-of-concept exploits exist for 21.7% of libraries with flaws, and that even very tiny (162 LoC) and very popular (included in 89% of applications) JavaScript libraries can contain exploitable flaws.</p><p>We describe the complex relationship between libraries and security flaws and show that using more libraries doesn't necessarily mean more problems; in fact, we see applications that manage to use thousands of libraries while inheriting few or no flaws. Also, an analysis of exploitability in the data set makes clear that attackers are focusing most heavily on two types of flaws: Insecure Deserialization and Broken Access Control.</p><p>We conclude by examining strategies to manage open source library flaws. We reveal that more than 81% of flaws can be fixed with minor patch or revision updates, but updated libraries can themselves be flawed or can disrupt dependencies. We show that developers can prioritize risk mitigation by focusing on the 1% of flaws that are known to exist on an application's executable path and have seen exploitation in the wild.</p>

Presenters:

  • Chris Eng - Chief Research Officer, Veracode
    Chris Eng is Chief Research Officer at Veracode. A founding member of the Veracode team, he is responsible for all research initiatives including applied research and product security. In addition to research, he consults with customers to advance their application security initiatives. Chris is a frequent speaker at industry conferences, and he serves on program committees for Black Hat USA and the Kaspersky Security Analyst Summit. Bloomberg, Fox Business, CBS, and other prominent media outlets have featured Chris in their coverage. Previously, Chris was Technical Director at Symantec (formerly @stake) and an Engineer at the National Security Agency.
  • Benjamin Edwards - Senior Data Scientist, Cyentia Institute
    Dr. Benjamin Edwards joined Cyentia at the beginning of 2019 as hire #1. He was formerly with IBM Research, where he worked on applying advanced machine learning techniques to solve real-world security problems and shaped the next generation of analytical security models. Before that, he received his Ph.D. from the University of New Mexico with a research focus that blended the fields of security, data science, and complex systems. His work has led to a better understanding of global attack trends, the effects of security interventions, and even nation-state cybersecurity policy.
