How to Detect Malicious Certificates in Your Spare Time

Presented at BSidesDC 2017, Oct. 8, 2017, 10:30 a.m. (50 minutes)

We present machine learning algorithms for detecting malicious certificates with a high level of accuracy. The performance of our algorithms meets the demands of deploying such models in a product. Interestingly, the key ingredients for building such models are all publicly available! However, one still needs to connect the dots, i.e., collect representative good and malicious certificates from various online sources and/or network traffic, and identify which “cookie-cutter” machine learning algorithms, available as Python libraries, to use.

Key takeaways from our presentation:

  • Understand how to leverage the fact that SSL certificates contain information in a structured format to build machine learning models.
  • It is embarrassingly easy to build algorithms for detecting malicious certificates using Python libraries. We will share results for three of them: Logistic Regression, Support Vector Machines, and Random Forests.
  • Identify which attributes are important for distinguishing between malicious and legitimate certificates.
  • The main challenge in deploying these models is the low-prevalence environment: on average, your network traffic will contain orders of magnitude fewer malicious certificates than legitimate ones. How do we fine-tune machine learning algorithms to perform robustly in such environments?
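Two of the points above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the presenters' pipeline. The feature names, the sample certificate, and both helper functions are assumptions made for illustration. The first function turns a certificate, given in the dictionary layout that Python's `ssl.SSLSocket.getpeercert()` returns, into a few numeric attributes that a classifier could consume. The second applies Bayes' rule to show why low prevalence is hard: even a detector with a high true positive rate and a low false positive rate raises mostly false alarms when malicious certificates are rare.

```python
import ssl

# Made-up certificate in the dict layout returned by ssl.SSLSocket.getpeercert().
SAMPLE_CERT = {
    "subject": ((("commonName", "example.test"),),),
    "issuer": ((("commonName", "example.test"),),),
    "notBefore": "Oct  1 00:00:00 2017 GMT",
    "notAfter": "Oct 31 00:00:00 2017 GMT",
    "subjectAltName": (("DNS", "example.test"),),
}


def cert_features(cert):
    """Map a getpeercert()-style dict to numeric features.

    The feature set is hypothetical, chosen only to show how structured
    certificate fields can become model inputs.
    """
    # Flatten the nested tuples of relative distinguished names into dicts.
    subject = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
    issuer = {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return {
        "validity_days": (not_after - not_before) / 86400.0,
        "self_signed": int(subject == issuer),  # crude check: same name fields
        "num_subject_fields": len(subject),
        "num_san_entries": len(cert.get("subjectAltName", ())),
    }


def precision_at_prevalence(tpr, fpr, prevalence):
    """Fraction of alerts that are truly malicious, by Bayes' rule.

    tpr: true positive rate, fpr: false positive rate,
    prevalence: fraction of all certificates that are malicious.
    """
    true_alerts = tpr * prevalence
    false_alerts = fpr * (1.0 - prevalence)
    total = true_alerts + false_alerts
    return true_alerts / total if total > 0 else 0.0


if __name__ == "__main__":
    print(cert_features(SAMPLE_CERT))
    for prevalence in (0.5, 0.01, 0.001):
        print(prevalence, precision_at_prevalence(0.99, 0.01, prevalence))
```

In this sketch the same detector (99% true positive rate, 1% false positive rate) goes from almost all alerts being correct on a balanced data set to roughly one alert in eleven being correct when one certificate in a thousand is malicious. That is why evaluation on balanced training data can overstate performance in real traffic.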

Presenters:

  • Jason Reaves - Threat Researcher at Fidelis Cybersecurity
    Jason Reaves is a threat researcher at Fidelis Cybersecurity. His work primarily focuses on reverse engineering data structures, algorithms, and botnet protocols found in malware. He develops signatures to detect threats, scripts and programs to automate data or malware configuration collection, and frameworks that automatically harvest threat-related data from various sources to support research projects. Before joining Fidelis Cybersecurity, he worked primarily on banking trojan research in the financial industry, building frameworks that pose as infected clients to automatically harvest configuration and targeting data.
  • Khaled Al-Hassanieh - Senior Software Engineer at Fidelis Cybersecurity
    Khaled Al-Hassanieh is a senior software engineer at Fidelis Cybersecurity. His work combines software engineering and machine learning. He develops and productizes models for malware detection and other security applications. Before joining Fidelis Cybersecurity, he was a postdoctoral researcher at Los Alamos and Oak Ridge National Laboratories. As a theoretical physicist, he developed theoretical and numerical models to study condensed matter systems. His research has led to 29 peer-reviewed articles in renowned journals. Khaled holds a Ph.D. in Physics from Florida State University.
  • Abhishek Sharma - Senior Data Scientist at Fidelis Cybersecurity
    Abhishek Sharma is a Data Scientist and Team Lead at Fidelis Cybersecurity. He develops predictive models for detecting malware and data-science-based products to enhance the productivity of security analysts. Prior to Fidelis Cybersecurity, he was a Researcher at NEC Labs America, where he researched how to use machine learning and data science to improve the efficiency and robustness of complex physical systems found in the power and manufacturing sectors. He has published more than 20 articles in peer-reviewed journals and conferences, and holds 7 patents. He received his Ph.D. in Computer Science from the University of Southern California, Los Angeles, and his 5-year Integrated Master's degree from the Indian Institute of Technology, Delhi.
