BinGold: Towards robust binary analysis by extracting the semantics of binary code as semantic flow graphs (SFGs)

Saed Alrabaee, Lingyu Wang, Mourad Debbabi

Research output: Contribution to conference › Paper › peer-review


Binary analysis is useful in many practical applications, such as the detection of malware or vulnerable software components. However, our survey of the literature shows that most existing binary analysis tools and frameworks rely on assumptions about specific compilers and compilation settings. It is well known that techniques such as refactoring and light obfuscation can significantly alter the structure of code, even for simple programs. Applying such techniques or changing the compiler and compilation settings can significantly affect the accuracy of available binary analysis tools, which severely limits their practicability, especially when applied to malware. To address these issues, we propose a novel technique that extracts the semantics of binary code in terms of both data and control flow. Our technique allows more robust binary analysis because the extracted semantics of the binary code are generally immune to light obfuscation, refactoring, and changes in compilers or compilation settings. Specifically, we apply data-flow analysis to extract the semantic flow of the registers as well as the semantic components of the control flow graph, which are then synthesized into a novel representation called the semantic flow graph (SFG). Subsequently, various properties, such as reflexive, symmetric, antisymmetric, and transitive relations, are extracted from the SFG and applied to binary analysis. We implement our system in a tool called BinGold and evaluate it against thirty binary code applications. Our evaluation shows that BinGold successfully determines the similarity between binaries, yielding results that are highly robust against light obfuscation and refactoring. In addition, we demonstrate the application of BinGold to two important binary analysis tasks: binary code authorship attribution, and the detection of clone components across program executables. The promising results suggest that BinGold can be used to enhance existing techniques, making them more robust and practical.
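
The relation properties named in the abstract (reflexive, symmetric, antisymmetric, transitive) can be illustrated on a toy directed graph. The sketch below is not BinGold's implementation and does not build an SFG; it only assumes that a graph's edges are treated as a binary relation over its nodes. The function name `relation_properties` and the register-named nodes are hypothetical, chosen for illustration.

```python
def relation_properties(nodes, edges):
    """Check which binary-relation properties a directed edge set satisfies.

    `nodes` is an iterable of hashable labels and `edges` is an iterable of
    (source, target) pairs. This is a generic check on any directed graph,
    not the feature extraction used by BinGold.
    """
    rel = set(edges)
    return {
        # Every node has a self-loop.
        "reflexive": all((n, n) in rel for n in nodes),
        # Every edge has its reverse edge.
        "symmetric": all((b, a) in rel for (a, b) in rel),
        # No pair of distinct nodes is connected in both directions.
        "antisymmetric": all(a == b or (b, a) not in rel for (a, b) in rel),
        # Whenever a->b and b->d exist, a->d exists too.
        "transitive": all(
            (a, d) in rel
            for (a, b) in rel
            for (c, d) in rel
            if b == c
        ),
    }


# Toy example: hypothetical data-flow edges between three registers.
toy_nodes = {"eax", "ebx", "ecx"}
toy_edges = {("eax", "ebx"), ("ebx", "ecx"), ("eax", "ecx")}
print(relation_properties(toy_nodes, toy_edges))
```

For this toy edge set the relation is antisymmetric and transitive, but neither reflexive nor symmetric. How such properties are turned into features for similarity or authorship analysis is described in the paper itself, not in this sketch.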

Original language: English
Publication status: Published - 2016
Externally published: Yes
Event: 16th Annual USA Digital Forensics Research Conference, DFRWS 2016 USA - Seattle, United States
Duration: Aug 7 2016 → Aug 10 2016


Conference: 16th Annual USA Digital Forensics Research Conference, DFRWS 2016 USA
Country/Territory: United States


  • Assembly instructions
  • Binary Analysis
  • Binary relation
  • Data flow analysis
  • Reverse engineering
  • Semantic features
  • Semantic flow graph

ASJC Scopus subject areas

  • Information Systems

