Harnessing Intelligence from Malware Repositories

Presented at Black Hat USA 2015, Aug. 6, 2015, 5 p.m. (60 minutes).

The number of unique malware samples has been doubling every year for over two decades. Most of the effort in malware analysis has focused on methods for preventing infection. We view the exponential growth of malware as an underutilized source of intelligence. Given that the number of malware authors is not doubling each year, the large volume of malware must contain evidence that connects them. The challenge is how to extract these connections.

Because malware is complex software, its development necessarily follows software engineering principles such as modular programming and the use of third-party libraries. Code shared between malware samples is therefore a viable indicator of a connection between their creators. However, identifying such shared code is not straightforward. To survive in an environment hostile to it, malware uses a variety of deceptions, such as polymorphic packing, for the explicit purpose of making such connections difficult to infer.
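To make the idea of shared code concrete, here is a minimal sketch in Python. It is not the method presented in the talk. It assumes that each sample has already been unpacked and disassembled into functions, each given as a list of instruction strings, and it uses a deliberately crude normalization (registers and constants replaced by placeholders) so that superficially different copies of the same routine hash to the same value. The names `normalize`, `fingerprint`, and `shared_fraction` are hypothetical.

```python
import hashlib
import re


def normalize(instruction: str) -> str:
    """Replace concrete constants and common x86 registers with placeholders."""
    text = instruction.lower().strip()
    # Immediates first: hex literals, then plain decimal numbers.
    text = re.sub(r"\b0x[0-9a-f]+\b|\b\d+\b", "IMM", text)
    # A small, illustrative subset of 16/32-bit register names.
    text = re.sub(r"\b(e?[abcd]x|e?[sd]i|e?[sb]p)\b", "REG", text)
    return text


def fingerprint(function_body: list[str]) -> str:
    """Hash the normalized instruction sequence of one function."""
    joined = "\n".join(normalize(ins) for ins in function_body)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def shared_fraction(sample_a: list[list[str]], sample_b: list[list[str]]) -> float:
    """Jaccard similarity between the function fingerprints of two samples."""
    prints_a = {fingerprint(f) for f in sample_a}
    prints_b = {fingerprint(f) for f in sample_b}
    if not prints_a or not prints_b:
        return 0.0
    return len(prints_a & prints_b) / len(prints_a | prints_b)


# Two functions that differ only in register choice and constants.
f1 = ["mov eax, 0x10", "add eax, ebx", "ret"]
f2 = ["mov ecx, 32", "add ecx, edx", "ret"]
f3 = ["xor eax, eax", "ret"]

print(fingerprint(f1) == fingerprint(f2))  # same normalized sequence
print(shared_fraction([f1, f3], [f2]))
```

Syntactic normalization of this kind is easily defeated by the deceptions described above, which is why the abstract pairs data mining with formal program analysis rather than relying on surface-level matching.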

By combining two orthogonal approaches - formal program analysis and data mining - we have developed a scalable method to search large-scale malware repositories for forensic evidence. Program analysis helps peer through the deceptions employed by malware to extract fragments of evidence. Data mining helps organize this mass of fragments into a web of connections that can then be queried in a variety of ways: to determine whether two apparently disparate cyber attacks are related; to transfer knowledge gained in countering one malware to other, similar malware; to get a holistic view of cyber threats; and to understand and track trends.
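The "web of connections" can be pictured as a graph whose nodes are samples and whose edges join samples that share at least one evidence fragment. The following Python sketch illustrates that picture under stated assumptions; it is not the talk's implementation. The input is assumed to be a mapping from a sample identifier to a set of opaque fragment identifiers, and `build_connections` and `related` are hypothetical names.

```python
from collections import defaultdict, deque


def build_connections(fragments_by_sample: dict[str, set[str]]) -> dict[str, set[str]]:
    """Link every pair of samples that share at least one fragment."""
    # Inverted index: fragment -> samples that contain it.
    index = defaultdict(set)
    for sample, fragments in fragments_by_sample.items():
        for fragment in fragments:
            index[fragment].add(sample)

    graph = {sample: set() for sample in fragments_by_sample}
    for samples in index.values():
        for sample in samples:
            graph[sample] |= samples - {sample}
    return graph


def related(graph: dict[str, set[str]], start: str, goal: str) -> bool:
    """Breadth-first search: is there any chain of shared fragments?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in graph[node] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return False


# Hypothetical repository: sample id -> fragment ids.
repo = {
    "a": {"f1", "f2"},
    "b": {"f2", "f3"},
    "c": {"f3"},
    "d": {"f9"},
}
graph = build_connections(repo)
print(related(graph, "a", "c"))  # True: a and c are linked through b
print(related(graph, "a", "d"))  # False: d shares nothing with the others
```

In practice, fragments that appear in a large share of samples (for example, common runtime library code) would probably need to be filtered or down-weighted before edges are drawn; otherwise nearly every sample ends up connected to every other.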

This talk will summarize our method, describe VirusBattle - a web service for cloud-based malware analysis developed at UL Lafayette - and present empirical evidence of the viability of mining large-scale malware repositories to draw meaningful inferences.


Presenters:

  • Arun Lakhotia - University of Louisiana at Lafayette
    Dr. Arun Lakhotia is a Professor of Computer Science at the University of Louisiana at Lafayette and the Director of the Software Research Laboratory. His research interests are in malware analysis, with a particular focus on methods to peer through the protection mechanisms used by malware. His research has led to the development of VirusBattle, an automated malware analysis web service that draws connections between malware using the semantics of their underlying code. Dr. Lakhotia had a brief foray into robotics, leading Team CajunBot in the development of CajunBot, an unmanned Jeep that participated in the 2007 DARPA Urban Challenge. Dr. Lakhotia earned his PhD in Computer Science from Case Western Reserve University in 1990. His research has been supported by DARPA, AFOSR, AFRL, and ARO. He is the recipient of the 2004 Louisiana Governor's Technology Leader of the Year Award.
  • Vivek Notani - University of Louisiana at Lafayette
    Vivek Notani is a Research Scholar at the Software Research Laboratory at the University of Louisiana at Lafayette. His research interests include malware analysis and reverse engineering. Previously, he worked in humanoid robotics and helped develop India's first indigenous humanoid robot, Acyut. Vivek received his MSc. (Tech.) in Information Systems from BITS-Pilani (India) in 2013.
