Subjective and objective code similarity measures

Presented at BalCCon2k22 - Loading (2022), Sept. 23, 2022, 4:15 p.m. (45 minutes)

To set the scene, we begin this session with an analysis of the characteristics of malicious VBA dropper code employed by APT actors in South Asia. This code evolves regularly but retains some consistent features, such as using VBA Forms to store the executable payload in a lightly obfuscated format. The similarity of the code used by the opposing groups is easy for a human researcher to spot but not as obvious to machine algorithms. The focus of the presentation is on similarity algorithms such as Normalized Compression Distance, Winnowing and Jaccard similarity, as well as common diffing algorithms. Similarity algorithms are among the main tools we can use to cluster malicious executables as well as source code. Their background is often in the domains of natural language processing and plagiarism/copycat detection. We will describe their operation and discuss their performance on a small set of samples attributed to the groups we describe in the first part of the session. We compare the effectiveness of the algorithms on unmodified code and on code with various levels of normalization. We conclude with a discussion of the scalability of similarity algorithms when applied to a large set of samples.
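
As a rough illustration of two of the measures named above, the following is a minimal Python sketch, not the presenter's implementation. It assumes zlib as the compressor for Normalized Compression Distance and whitespace-separated token shingles for Jaccard similarity. The function names (`ncd`, `shingles`, `jaccard`), the shingle size `k=3` and the sample strings are hypothetical choices made for this example. Winnowing and diffing are left out to keep it short.

```python
import zlib


def ncd(a: bytes, b: bytes) -> float:
    """Normalized Compression Distance, using zlib as the compressor (an assumption)."""
    ca = len(zlib.compress(a, 9))
    cb = len(zlib.compress(b, 9))
    cab = len(zlib.compress(a + b, 9))
    # Near 0 for very similar inputs, near 1 for unrelated inputs.
    return (cab - min(ca, cb)) / max(ca, cb)


def shingles(text: str, k: int = 3) -> set:
    """Set of k-token shingles from whitespace-separated tokens."""
    tokens = text.split()
    if len(tokens) < k:
        # Too short to form a full shingle: use the whole token sequence, if any.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical VBA-like snippets, invented for this example only.
    s1 = "Sub AutoOpen()\n payload = UserForm1.TextBox1.Text\nEnd Sub\n"
    s2 = "Sub AutoOpen()\n data = UserForm2.TextBox1.Text\nEnd Sub\n"
    print("NCD:", ncd(s1.encode(), s2.encode()))
    print("Jaccard:", jaccard(shingles(s1), shingles(s2)))
```

Both functions work on raw text here. Normalizing the code first (for example, renaming identifiers or stripping whitespace) would change the scores, which is the kind of comparison the abstract refers to.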

Presenters:

  • Vanja Švajcer
    Vanja Švajcer works as a Technical Leader for Cisco Talos. He is a security researcher with more than 20 years of experience in malware research and threat intelligence. Prior to joining Talos, Vanja worked for SophosLabs and in a Security Research Team at Hewlett Packard Enterprise. Vanja enjoys tinkering with automated analysis systems, reversing binaries and analysing mobile malware. He thinks time spent scraping telemetry data to find indicators of new attacks is well worth the effort. He has presented his work at conferences such as Virus Bulletin, RSA, CARO, AVAR, BalCCon and others. In his free time, he is trying to improve his acoustic guitar skills and occasionally attempts to play basketball, which, at his age, is not a recommended activity.
