Rich headers: leveraging the mysterious artifact of the PE format

Presented at VB2019, Oct. 4, 2019, 11 a.m. (30 minutes).

Ever since the release of *Visual Studio 97 SP3*, *Microsoft* has been placing an undocumented chunk of data between the DOS and NT headers of every native PE binary produced by its linker, with no way to opt out. The data describes the build environment and the scale of the project, stored in a simple yet effective way as blocks of the following values: a product ID, its build number, and the number of times it was used during the compilation. Several research papers on this topic have been released over the years, coining the name "Rich Header" and shedding some light on its purpose and structure, but we feel that the security industry has never used it to its full potential.
When an analyst encounters rare custom malware involved in an APT and is grasping at straws to draw conclusions about the case, this mysterious structure could provide reasonable clues. Not only does it reveal the types of components involved in the project behind the malware and the compiler tools used, but, because it forms an abundant set of variations, it also helps with looking up similar samples. We introduce a hierarchy of similarity levels together with real-world examples where they could be applied successfully.
For various crimeware kits, which are (re)distributed on a daily basis, the header could suggest whether their source code is widely available or completely under the control of a single actor. Moreover, the header of the malware packer that encapsulates them, which often manifests certain anomalies, could be used to cluster a larger set of samples of the same nature. These inconsistencies could be easily identified and turned into heuristics based on situations such as an unusual offset or size of the header, an invalid product ID or an invalid combination of product ID and build version, an image size that does not correspond to the magnitude of the project, etc.
In our talk we will also showcase our in-house designed database infrastructure and the tooling we've built around it: similarity lookup, a rule-based notification system for malware hunting, and the detection of anomalies. The database currently holds the Rich Header information for tens of millions of executables and processes a live feed of newly incoming files. It is currently growing by ~180 thousand unique records per day.
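
The block layout described above (product ID, build number, use count) can be illustrated with a minimal Python sketch. It is not the presenters' tooling. It assumes the layout commonly described in public research on the Rich Header: a "DanS" marker followed by three padding dwords, then 8-byte entries, then a "Rich" marker followed by a 32-bit XOR key that masks everything before it. The function name `parse_rich_header` and the returned dictionary shape are hypothetical.

```python
import struct

DANS = 0x536E6144  # the ASCII bytes "DanS" read as a little-endian dword


def parse_rich_header(data: bytes):
    """Return the decoded Rich Header of a PE image, or None if absent.

    Assumed layout (from public research, not an official specification):
    DanS^key, 3 padding dwords, N * (comp_id^key, count^key), "Rich", key.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew at offset 0x3C points to the NT headers; the Rich Header,
    # when present, sits between the DOS header and that offset.
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    end = data.find(b"Rich", 0x40, e_lfanew)
    if end == -1 or end + 8 > len(data):
        return None
    key = struct.unpack_from("<I", data, end + 4)[0]
    # The start marker is stored XORed with the key, so search for that.
    start = data.find(struct.pack("<I", DANS ^ key), 0x40, end)
    if start == -1 or (end - start - 16) % 8 != 0:
        return None
    entries = []
    for off in range(start + 16, end, 8):
        comp_id, count = struct.unpack_from("<II", data, off)
        comp_id ^= key
        count ^= key
        # High 16 bits: product ID; low 16 bits: build number.
        entries.append((comp_id >> 16, comp_id & 0xFFFF, count))
    return {"offset": start, "key": key, "entries": entries}


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        header = parse_rich_header(f.read())
    if header is None:
        print("no Rich Header found")
    else:
        print(f"offset=0x{header['offset']:x} key=0x{header['key']:08x}")
        for prod_id, build, count in header["entries"]:
            print(f"product_id={prod_id} build={build} count={count}")
```

The returned offset and the decoded (product ID, build, count) tuples are the kind of raw values on which heuristics like those listed in the abstract could be built, for example flagging a header at an unexpected offset.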

Presenters:

  • Peter Kálnai - ESET   as Peter Kalnai
    Peter Kálnai is a malware researcher at ESET. As a speaker, he has represented ESET at various international conferences including Virus Bulletin, AVAR, CARO Workshop, OFFZONE and cyberCentral. He mostly hates malware such as crypto-ransomware, because it shows hardly any inventiveness and has a very destructive impact on the victim. His golden rule for cyberspace is always to prioritise security measures over user comfort. In his free time he enjoys foosball and travelling.
  • Michal Poslušný - ESET   as Michal Poslusny
    Michal Poslušný is a malware researcher at ESET, where he is mainly responsible for reverse engineering complex malware threats. He also develops various internal projects and tools, and has actively participated in research presented at the Virus Bulletin, AVAR, CARO Workshop and OFFZONE conferences. In his free time he likes to play online games, develop fun projects and spend time with his family.
