The Role of Data Visualization in Improving Machine Learning Models

Presented at BSidesLV 2017, July 26, 2017, noon (30 minutes).

Improving a machine learning model is impossible without a clear understanding of its current performance. To get that understanding, the Endgame data science team built Bit Inspector. Bit Inspector is an internal data visualization tool that Endgame uses to communicate the proficiency of our binary classification product, MalwareScore. It includes plots and metrics used to judge the performance of a model both overall and across many sample subclasses. It also displays details about individual samples that can provide context about misclassifications. Bit Inspector has grown to include model performance summaries and real-time performance tracking, and has proven valuable not just for data scientists but also for project managers, product managers, and executives who want to better understand the efficacy of MalwareScore. By tracking the right metrics through data visualizations, a data science team can stay focused on improving the model and on communicating that improvement to stakeholders.


Presenters:

  • Phil Roth - Data Scientist - Endgame
    As a data scientist at Endgame, Phil develops data products that help security analysts find and respond to threats. This work has ranged from tuning a machine learning algorithm to best identify malware to building a data exploration platform for HTTP request data. Previously, he developed image processing algorithms for a small defense contractor. While earning a PhD in physics, Phil used a machine learning algorithm and the IceCube detector at the South Pole to search for neutrinos from other galaxies.
