50 Thousand Needles in 5 Million Haystacks: Understanding Old Malware Tricks to Find New Malware Families

Presented at Black Hat Europe 2016, Nov. 3, 2016, 11:15 a.m. (60 minutes)

The malware landscape is characterised by rapid and constant evolution. Defenders often find themselves one step behind, resulting at best in monetary losses and, in the most extreme cases, even endangering human lives. Corporations, with the unique challenges they face, must assume that sooner or later malware infections will get through their security perimeter. Efforts should then be focused on early detection, to contain and quickly mitigate threats before they cause any substantial damage. Even today's stealthiest malware, if it is controlled remotely, needs active network communication to report back to the attacker. This activity gives us a visibility advantage. Nowadays we have the computational power and mechanisms to process huge amounts of data. Machine learning gives us the algorithms to analyse network data in order to find specific types of behaviour. The challenge is how to use this technology to detect what matters most: malicious behaviours that pose a high risk to companies.

In this talk we address four key challenges related to automatic malware detection in network traffic: how to detect malware that changes its network behaviour over time (e.g. by changing different parts of the URL), how to mitigate potential mislabeling of the training data, and how to perform large-scale multi-class detection. We also introduce a training mechanism that automates the learning process and improves the precision of the classifiers. We present unique algorithms that help solve different problems in each of the identified challenges. The results of our research form part of a working intrusion detection system that consumes real network traffic from more than 5 million users per day. We show how these methods can be used to learn from well-known malware samples, generalise their behaviour, and consequently find novel threats. We illustrate the detection performance of each algorithm with real examples of malware detected by the algorithms described in this work. We also elaborate on how the infections found would otherwise have been missed by traditional detection tools.


  • Veronica Valeros - Malware Researcher, Cognitive Threat Analytics, Cisco Systems
    Veronica Valeros is a security researcher from Argentina. Since 2013, she has worked as a malware analyst at Cognitive Threat Analytics (CTA, a part of Cisco Systems), Prague, Czech Republic, where she specializes in malware network traffic analysis, network behavioral patterns, and threat categorization. Prior to CTA, Veronica worked independently on various projects involving data analysis, machine learning, and malware sandboxing. Veronica is also the co-founder of MatesLab hackerspace, Buenos Aires, Argentina.
  • Lukas Machlica - Researcher, Cognitive Threat Analytics, Cisco Systems
    Lukas Machlica has been a researcher at Cognitive Threat Analytics (CTA), Cisco Systems, since June 2014. His research focuses mainly on the classification of network traffic. Lukas holds a PhD in Cybernetics, defended in January 2013 at the Faculty of Applied Sciences, University of West Bohemia in Pilsen, Czech Republic. Prior to Cisco, Lukas was a researcher at the University of West Bohemia, working in the fields of speaker and speech recognition and natural language understanding. Lukas also participated in biological projects at the University of South Bohemia research center in Nove Hrady, Czech Republic, focusing on signal processing of HPLC/MS data, automatic analysis of chemical compounds, and image processing for the detection of cells in an image.
  • Karel Bartos - Research Engineer, Cognitive Threat Analytics, Cisco Systems
    Karel Bartos is a researcher at Cognitive Threat Analytics, Cisco Systems. His research focuses mainly on the classification of network traffic, data fusion, and sampling. Prior to Cisco, Karel was a researcher at Czech Technical University in Prague and CESNET, z.s.p.o., developing a NetFlow-based anomaly detection system. Karel holds a master's degree in software development from the Faculty of Nuclear Sciences and Physical Engineering of the Czech Technical University in Prague. He is currently pursuing his PhD at the Department of Computer Science of the Czech Technical University in Prague.

