The volume of MSIL malware in the wild is high and rising. One reason is that MSIL binaries run within the .NET Framework, with their byte code executed by a virtual machine, and AV engines have been relatively slow to support MSIL emulation and deobfuscation. That slowness might be because any binary written in C#, for example, and compiled to MSIL can typically be disassembled or decompiled easily to recover the original source code, often complete with the original variable names. However, commercial and custom MSIL protectors are now very commonly used to hide the source code. These protectors, which introduce varying levels of obfuscation into the compiled MSIL binaries, are heavily employed by malware authors to evade AV detection.
MSIL protectors have adopted two main approaches: the first is to disrupt ILDasm, a tool used to disassemble .NET code, and the second is to obfuscate or even corrupt MSIL metadata. This paper explores the gamut of obfuscation techniques employed on MSIL binaries, with a focus on the newest ones, and explains how they affect signature-based AV detection. We then discuss a few deobfuscation methods, and look at the possibility of automating them to facilitate family-wise grouping of samples.