Towards Transparent Dynamic Binary Instrumentation using Virtual Machine Introspection

Presented at REcon 2015, June 19, 2015, 5 p.m. (30 minutes)

The idea is simple enough: binary instrumentation as done by DynamoRIO and PIN can easily be detected and evaded by malicious binaries, as shown at last year's Black Hat USA by Li et al. [3]. To overcome this limitation I've built yet another prototype, which uses virtual machine introspection techniques to do the instrumentation. Once the execution of a binary within a VM reaches a point where instrumentation code should be executed, the following happens: first, the VM is stopped and the instrumentation code is injected into a new page within the guest. Next, the current execution context of the inspected binary is saved and the instruction pointer is set to the code on the new page, which executes the instrumentation callback. Once the callback finishes, the hypervisor restores the saved execution context of the guest and the instrumented binary continues along its normal execution path. This makes the instrumentation much harder to detect (the only way I can think of is a timing attack) at the cost of instrumentation granularity.
The system is surprisingly easy to use in practice: the user writes a callback function in ordinary C while still being able to access the whole execution context of the instrumented binary via pointers. If the ABIs of the host and the guest OS match, it is even possible to call libc functions. Once compiled (a matter of typing "make"), a callback function can be triggered by the system when a certain memory location is read or written, when execution reaches a certain point, or on each basic block. This is much less than PIN, DynamoRIO and others can do, but it suffices for most cases in which you have to deal with heavily obfuscated code.
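
To make the order of the steps concrete, here is a toy model of the sequence in Python. It is only a sketch and not the prototype's code: `ToyGuest`, `Context`, `run_instrumentation` and the page bookkeeping are hypothetical names invented for illustration, the "guest" is a plain Python object, and the callback runs as a host-side Python function instead of as injected code inside the guest.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List

PAGE_SIZE = 0x1000  # assumed page size for the toy address space


@dataclass
class Context:
    """Hypothetical guest register state (only what the sketch needs)."""
    rip: int
    rsp: int
    rax: int = 0


@dataclass
class ToyGuest:
    """Stand-in for a VM: a run flag, one vCPU context, and mapped pages."""
    ctx: Context
    running: bool = True
    pages: Dict[int, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


def run_instrumentation(guest: ToyGuest, trigger_addr: int,
                        callback: Callable[[Context], None]) -> bool:
    """Replay the stop / inject / save / redirect / restore sequence once."""
    if guest.ctx.rip != trigger_addr:
        return False  # execution has not reached the instrumentation point

    guest.running = False                      # 1. stop the VM
    guest.log.append("stop")

    page = max(guest.pages, default=0) + PAGE_SIZE
    guest.pages[page] = "callback code"        # 2. inject code into a new page
    guest.log.append("inject")

    saved = copy.copy(guest.ctx)               # 3. save the execution context
    guest.log.append("save")

    guest.ctx.rip = page                       # 4. point the IP at the new page
    guest.log.append("redirect")

    guest.running = True
    callback(saved)                            # callback sees the saved context
    guest.log.append("callback")

    guest.running = False
    guest.ctx = saved                          # 5. restore the saved context
    guest.log.append("restore")
    guest.running = True                       # binary continues as before
    return True
```

After a call that hits the trigger address, `guest.ctx.rip` is back at its original value, which is the property that keeps the instrumented binary on its normal execution path. The abstract does not say what happens to the injected page afterwards, so the sketch simply leaves it mapped.
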
I've also used the system successfully to generate function call traces of a binary and to perform timing attacks on several poorly implemented software protection schemes (yes, read this as "automatically brute-forcing a license key, CTF flag, or whatever other program input, character by character").
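
The character-by-character idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the prototype's code: `make_toy_check` is a hypothetical stand-in for a protection scheme with an early-exit comparison, and the "cost" it reports (number of comparison steps) stands in for whatever the real system measures, for example the number of basic-block callbacks fired. The sketch also assumes the input length is known.

```python
from typing import Callable, Optional, Tuple

# An oracle runs the target on one candidate input and reports
# (accepted, cost), where cost grows with how far the check got.
Oracle = Callable[[str], Tuple[bool, int]]


def make_toy_check(secret: str) -> Oracle:
    """Build a deliberately weak check: it stops at the first wrong character."""
    def check(candidate: str) -> Tuple[bool, int]:
        steps = 0
        if len(candidate) != len(secret):
            return False, steps
        for got, want in zip(candidate, secret):
            steps += 1              # one unit of observable work per character
            if got != want:
                return False, steps
        return True, steps
    return check


def recover_input(oracle: Oracle, length: int, alphabet: str) -> Optional[str]:
    """Recover the accepted input one position at a time."""
    known = ""
    for pos in range(length):
        best_char, best_cost = alphabet[0], -1
        for ch in alphabet:
            # Known prefix, one candidate character, then filler padding.
            guess = known + ch + alphabet[0] * (length - pos - 1)
            accepted, cost = oracle(guess)
            if accepted:
                return guess
            if cost > best_cost:    # the check ran longer: likely a correct char
                best_char, best_cost = ch, cost
        known += best_char
    return None                     # nothing was accepted with this alphabet
```

For example, `recover_input(make_toy_check("REC0N"), 5, "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")` returns `"REC0N"` after at most 36 guesses per position, instead of the 36^5 guesses a blind search over the same alphabet would need.
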

In the presentation I will explain the inner workings of the prototype and show some nice applications to real-world and CTF code.

