A binary's call graph is a treasure trove that has been largely neglected in threat research. Dozens of features can be extracted from a call graph; they are remarkably useful for threat detection, and they can also be leveraged for more advanced binary classification and authorship attribution research. The basic premise is that designing a malicious application is resource intensive, so the design is likely to stay the same across different pieces of malware written by the same authors. In other words, a keylogging module remains the same keylogging module no matter where it is copied and pasted. Being able to express such observations as features is therefore a powerful capability. Designing resilient features that preserve this kind of information is challenging, however, and numerous measures can be taken to destroy them.
This talk examines how changes induced by a standard compiler affect the call graphs of malicious code, effectively destroying advanced feature sets, and how features can be lifted to a more abstract level where they still preserve the underlying information.
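As a rough illustration of the idea, and not the method presented in the talk, the following sketch compares a concrete call graph feature with a more abstract one. Everything in it is an assumption made for the example: the toy keylogger graph, the helper names `degree_features`, `reachable_apis` and `inline`, and the choice of function inlining as the compiler-induced change.

```python
def degree_features(graph):
    """Concrete features: node count, edge count, sorted out-degree sequence."""
    out_degrees = sorted(len(callees) for callees in graph.values())
    return {"nodes": len(graph), "edges": sum(out_degrees), "out_degrees": out_degrees}


def reachable_apis(graph, root, apis):
    """Abstract feature: the set of imported API functions reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return frozenset(seen & apis)


def inline(graph, target):
    """Simulate inlining: drop `target` and hand its callees to every caller."""
    callees = graph[target] - {target}
    result = {}
    for fn, outs in graph.items():
        if fn == target:
            continue
        new_outs = set(outs)
        if target in new_outs:
            new_outs.remove(target)
            new_outs |= callees
        result[fn] = new_outs
    return result


# Hypothetical keylogging module: adjacency sets, caller -> callees.
APIS = {"GetAsyncKeyState", "CreateFileW", "WriteFile"}
original = {
    "keylog_main": {"poll_keys", "flush_log"},
    "poll_keys": {"GetAsyncKeyState"},
    "flush_log": {"CreateFileW", "WriteFile"},
    "GetAsyncKeyState": set(),
    "CreateFileW": set(),
    "WriteFile": set(),
}

# Assumed compiler effect: both helper functions get inlined into the caller.
optimized = inline(inline(original, "poll_keys"), "flush_log")

print(degree_features(original))
print(degree_features(optimized))
print(reachable_apis(original, "keylog_main", APIS)
      == reachable_apis(optimized, "keylog_main", APIS))
```

In this toy case the node count, edge count and degree sequence all change once the helpers are inlined, while the set of APIs reachable from the module's entry point stays the same. Real binaries are far messier, so this is only meant to show what lifting a feature to a more abstract level can look like.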