Enhancing Control Flow Graph Based Binary Function Identification

Presented at DeepSec 2017 „Science First!“

Detecting functions in compiled code is a major stepping stone towards any advanced binary analysis technique. Nucleus [1] is a novel algorithm that uses the interprocedural control flow graph to detect function boundaries. Building upon this technology, we propose a new approach to the related problem of identifying previously seen, known functions within a binary. Our idea is to compare the control flow graphs (CFGs) of unknown functions from a binary against those of known functions from a previously generated database. Unlike traditional approaches, our method takes into account that the underlying graph matching problem is performed on CFGs of binary code. First, it uses instruction-level knowledge about basic blocks as additional constraints for graph isomorphism. Second, it accounts for optimizations and transformations introduced by different compilers that affect the shape of the CFG. Our approach aims to avoid false positives (wrongly assigning a known function symbol to an unknown function) at all costs. The evaluation shows that the method is very effective at reducing false positive matches (below one percent in most cases) while maintaining recall rates as high as 72.8% when matching functions across two different nginx versions (1.12.1 and 1.10.3).
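The matching idea can be illustrated with a minimal pure-Python sketch. It is not Nucleus and not the implementation evaluated in the talk. It assumes each basic block carries a hypothetical label (here a tuple of coarse instruction classes) and accepts a match only if an edge-preserving bijection between blocks exists in which mapped blocks have equal labels. The names CFG, find_isomorphism and match_function are invented for this example, and the handling of compiler-induced CFG changes is not modelled. To stay conservative about false positives, the lookup returns a symbol only when exactly one database entry matches.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

Node = Hashable
Label = Tuple[str, ...]  # hypothetical per-block label, e.g. instruction classes


class CFG:
    """A function's control flow graph with one label per basic block."""

    def __init__(self, labels: Dict[Node, Label],
                 edges: Iterable[Tuple[Node, Node]]):
        self.labels = dict(labels)
        self.edges = set(edges)
        self.succ = {n: set() for n in self.labels}
        self.pred = {n: set() for n in self.labels}
        for src, dst in self.edges:
            self.succ[src].add(dst)
            self.pred[dst].add(src)


def find_isomorphism(a: CFG, b: CFG) -> Optional[Dict[Node, Node]]:
    """Backtracking search for a label-preserving graph isomorphism a -> b."""
    if len(a.labels) != len(b.labels) or len(a.edges) != len(b.edges):
        return None
    order = list(a.labels)
    mapping: Dict[Node, Node] = {}
    used = set()

    def compatible(u: Node, v: Node) -> bool:
        # Instruction-level constraint: block labels must be identical.
        if a.labels[u] != b.labels[v]:
            return False
        if len(a.succ[u]) != len(b.succ[v]) or len(a.pred[u]) != len(b.pred[v]):
            return False
        if (u in a.succ[u]) != (v in b.succ[v]):  # self-loops must agree
            return False
        # Edges to and from already mapped blocks must be preserved.
        for w, x in mapping.items():
            if (w in a.succ[u]) != (x in b.succ[v]):
                return False
            if (w in a.pred[u]) != (x in b.pred[v]):
                return False
        return True

    def extend(i: int) -> bool:
        if i == len(order):
            return True
        u = order[i]
        for v in b.labels:
            if v in used or not compatible(u, v):
                continue
            mapping[u] = v
            used.add(v)
            if extend(i + 1):
                return True
            del mapping[u]
            used.discard(v)
        return False

    return dict(mapping) if extend(0) else None


def match_function(unknown: CFG, database: Dict[str, CFG]) -> Optional[str]:
    """Return a known symbol only if exactly one database entry matches."""
    hits = [name for name, known in database.items()
            if find_isomorphism(unknown, known) is not None]
    return hits[0] if len(hits) == 1 else None


# Two diamond-shaped CFGs with different block names but equal labels.
known = CFG(
    {"A": ("cmp", "jcc"), "B": ("mov",), "C": ("call",), "D": ("ret",)},
    [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")],
)
unknown = CFG(
    {0: ("cmp", "jcc"), 1: ("call",), 2: ("mov",), 3: ("ret",)},
    [(0, 1), (0, 2), (1, 3), (2, 3)],
)
# Same shape as `known`, but one block has a different label.
other = CFG(
    {"A": ("cmp", "jcc"), "B": ("mov",), "C": ("mov",), "D": ("ret",)},
    [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")],
)
```

With these definitions, match_function(unknown, {"f": known, "g": other}) selects only the entry whose block labels agree, even though both candidates have the same graph shape. Exhaustive backtracking is exponential in the worst case, so this is suitable only for illustrating the constraint, not for matching at scale.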


