No Need to Teach New Tricks to Old Malware: Winning an Evasion Challenge with XOR-based Adversarial

Presented at DeepSec 2020 „The Masquerade“.

Adversarial machine learning has become so popular that machine learning (ML) based security solutions are now the target of many attacks and, as a consequence, must adapt to them to remain effective. In our talk, we explore attacks on different ML models used to detect malware, drawing on our experience in the Machine Learning Security Evasion Competition (MLSEC) 2020, sponsored by Microsoft and CUJO AI's Vulnerability Research Lab, in which we finished in first and second place in the attacker and defender challenges, respectively.

During the contest's first edition (2019), participating teams were challenged to bypass three ML models in a white-box manner. Our team bypassed all three and reported interesting insights about the models' weaknesses. This year, the challenge evolved into an attack-and-defense format: teams could propose defensive models and attack other teams' models in a black-box manner. Despite the increase in difficulty, our team was again able to bypass all models, which allowed us to present insights on attacking models as well as on defending them from adversarial attacks.

In particular, we showed how frequency-based models (e.g., TF-IDF) are vulnerable to the addition of dead function imports, and how models based on raw bytes are vulnerable to payload-embedding obfuscation (e.g., XOR and base64 encoding). One of the main contributions of this work is to show that adversarial attacks are more practical against real-life models than previously thought, affecting even antivirus products used by end users.
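The two weaknesses named above can be illustrated with a toy sketch. It is not the tooling used in the competition; it only shows, under simplifying assumptions, why the features these model families rely on are easy to shift. The helper names (tf_vector, cosine, xor_bytes, ngrams), the key value, and the placeholder byte string are all hypothetical, and a plain normalised term-frequency vector stands in for a full TF-IDF pipeline.

```python
import base64
import math
from collections import Counter


def tf_vector(tokens):
    """L2-normalised term-frequency vector (a simplified stand-in for TF-IDF)."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def xor_bytes(data, key):
    """XOR every byte with a single-byte key (applying it twice restores the input)."""
    return bytes(b ^ key for b in data)


def ngrams(data, n):
    """Set of all byte n-grams in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


# Part 1: frequency-based features. The import list of a sample is padded
# with imports that are declared but never called ("dead" imports).
suspicious = ["CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"]
dead_imports = ["GetTickCount", "MessageBoxA", "GetSystemTime"] * 5
similarity = cosine(tf_vector(suspicious), tf_vector(suspicious + dead_imports))
print(f"cosine similarity after padding: {similarity:.3f}")

# Part 2: raw-byte features. A harmless placeholder string stands in for
# an embedded payload; KEY is an arbitrary illustrative value.
KEY = 0x5A
payload = b"placeholder payload bytes " * 4
encoded = xor_bytes(payload, KEY)
b64 = base64.b64encode(payload)

shared = ngrams(payload, 4) & ngrams(encoded, 4)
print(f"4-grams shared by plain and XOR-encoded bytes: {len(shared)}")
print(f"XOR round-trip ok: {xor_bytes(encoded, KEY) == payload}")
print(f"base64 round-trip ok: {base64.b64decode(b64) == payload}")
```

In this sketch, padding dilutes the weight of the original terms in the normalised vector, and the encoded bytes no longer share byte n-grams with the plain ones even though both encodings are trivially reversible. A defender can use the same kind of measurement to check how much a feature set moves under such transformations.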


Presenters:

  • Fabrício Ceschin - UFPR - Computer Science
    Fabrício Ceschin is a Ph.D. student at the Federal University of Paraná, Brazil, where he received his M.S. degree in informatics. He was a recipient of the Google Latin America Research Awards 2017/2018. His research interests include machine learning applied to cybersecurity, such as data streams, concept drift, and adversarial machine learning.
