Greybox Program Synthesis: A New Approach to Attack Dataflow Obfuscation

Presented at Black Hat USA 2021, Aug. 5, 2021, 10:20 a.m. (40 minutes)

Obfuscation is being broadly adopted across a wide range of applications, especially to protect intellectual property (IP) in the mobile ecosystem (Android, iOS) and in embedded systems at large. It is now ubiquitous, and everyone is unwillingly and unknowingly executing obfuscated code. With adoption, it has also gained maturity and potency, making the assessment of such protections increasingly difficult.

It is used in a variety of contexts, from malware to famous and widely used mobile applications. In either case, the goal is to protect software secrets, communication protocols, APIs, and inner workings from reverse engineering. Finding new ways to defeat evolving obfuscation schemes is therefore becoming more and more important in this endless cat-and-mouse game.

This talk presents the latest advances in program synthesis applied to deobfuscation. It aims to demystify this analysis technique by showing how it can be put into action against obfuscation. In particular, the Qsynthesis implementation released for this talk shows a complete end-to-end workflow that turns obfuscated assembly instructions back into optimized (deobfuscated) instructions and reassembles them into the binary.

More specifically, the talk presents a greybox synthesizer that combines two core components: an I/O-based black-box synthesis using precomputed tables, and a white-box AST search algorithm backed by symbolic execution. This new approach provides a very good trade-off between accuracy and speed. Various experiments to improve it, such as expression linearization, expression learning, and JIT compilation of table evaluation, will be presented with their strengths and weaknesses against the obfuscation schemes attacked.

Among existing schemes that impede program understanding, we show results obtained on various transformations such as Mixed-Boolean-Arithmetic (MBA), arithmetic encoding, and virtualization, originating from multiple obfuscators such as Tigress and YANSOllvm, as well as from commercial applications.

Finally, we will highlight limitations of the approach, the open research problems it raises, and various insights on how to improve the algorithm to bypass roadblocks and better leverage program synthesis for deobfuscation.
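
The greybox idea described in the abstract, a lookup of input/output behavior in a precomputed table combined with a top-down walk of the expression tree, can be illustrated with a minimal Python sketch. This is not the Qsynthesis API: the names `INPUTS`, `build_table`, `evaluate`, and `simplify`, the tiny two-variable grammar, and the tuple-based AST are all assumptions made for illustration. The AST is written by hand here, whereas the approach in the talk obtains it through symbolic execution.

```python
import itertools

MASK = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit machine arithmetic

# Fixed input vectors used to fingerprint an expression (illustrative values).
INPUTS = [(0, 0), (1, 2), (0xFFFF, 0x1234), (0xDEADBEEF, 0xCAFEBABE), (MASK, 1)]

LEAVES = {"x": lambda x, y: x, "y": lambda x, y: y}
OPS = {
    "+": lambda a, b: (a + b) & MASK,
    "-": lambda a, b: (a - b) & MASK,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def build_table():
    """Black-box part: map each output vector to the first (smallest) expression producing it."""
    table = {}
    for name, fn in LEAVES.items():
        table.setdefault(tuple(fn(x, y) for x, y in INPUTS), name)
    for (l, lf), (op, of), (r, rf) in itertools.product(
        LEAVES.items(), OPS.items(), LEAVES.items()
    ):
        outs = tuple(of(lf(x, y), rf(x, y)) for x, y in INPUTS)
        table.setdefault(outs, f"({l} {op} {r})")
    return table


def evaluate(node, x, y):
    """Evaluate a hand-built AST: 'x', 'y', an int constant, or (op, left, right)."""
    if node == "x":
        return x
    if node == "y":
        return y
    if isinstance(node, int):
        return node & MASK
    op, left, right = node
    return OPS[op](evaluate(left, x, y), evaluate(right, x, y))


def simplify(node, table):
    """White-box part: try the table on the whole node, else recurse into its children."""
    if isinstance(node, int):
        return hex(node)
    if isinstance(node, str):
        return node
    outs = tuple(evaluate(node, x, y) for x, y in INPUTS)
    if outs in table:
        return table[outs]  # candidate only: equal on the sampled inputs
    op, left, right = node
    return f"({simplify(left, table)} {op} {simplify(right, table)})"


if __name__ == "__main__":
    table = build_table()
    # MBA-style encoding of x + y, wrapped in a constant the table does not know.
    obfuscated = ("^", ("+", ("|", "x", "y"), ("&", "x", "y")), 0x5A)
    print(simplify(obfuscated, table))  # ((x + y) ^ 0x5a)
```

The root expression misses the table because of the constant, so the search descends and replaces the sub-expression `(x | y) + (x & y)` with `(x + y)`. A match on a handful of sampled inputs is only a candidate; a real tool would presumably need an additional equivalence check before rewriting the binary.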

Presenters:

  • Robin David - Security Researcher, Quarkslab
    Robin David (@RobinDavid1) is a Software Security Researcher at Quarkslab, where he leads the automated analysis team. His work involves reverse engineering a wide range of low-level systems. In particular, he likes attacking obfuscation with techniques such as symbolic execution and, more recently, program synthesis. He has previously presented on these topics at an earlier Black Hat edition.
