SIGMA: A semantic integrated graph matching approach for identifying reused functions in binary code

Saed Alrabaee, Paria Shirani, Lingyu Wang, Mourad Debbabi

Research output: Contribution to journal › Article › peer-review

53 Citations (Scopus)


The ability to efficiently recognize reused functions in binary code is critical to many digital forensics tasks, especially since modern malware typically contains a significant number of functions borrowed from open source software packages. Such a capability not only improves the efficiency of reverse engineering, but also reduces the odds of common libraries leading to false correlations between unrelated code bases. In this paper, we propose SIGMA, a technique for identifying reused functions in binary code by matching traces of a novel representation of binary code, namely, the Semantic Integrated Graph (SIG). SIGs enhance and merge several existing concepts from classic program analysis, including the control flow graph, register flow graph, and function call graph, into a joint data structure. Such a comprehensive representation allows us to capture different semantic descriptors of common functionalities in a unified manner as graph traces, which can be extracted from binaries and matched to identify reused functions, actions, or open source software packages. Experimental results are promising. Furthermore, we demonstrate the effectiveness of our approach through a case study using two malware families known to share common functionalities, namely, Zeus and Citadel.
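As a rough illustration of the idea in the abstract, the sketch below merges three per-function edge lists into one edge-labeled graph, extracts short "traces" from it, and compares two functions by trace overlap. It is not the paper's algorithm: the function names (merge_graphs, traces, similarity), the reduction of a trace to a sequence of edge labels, and the use of Jaccard similarity are all assumptions made here for illustration only.

```python
from collections import defaultdict


def merge_graphs(cfg, rfg, fcg):
    """Merge three edge lists into one adjacency map with labeled edges.

    Hypothetical stand-in for a joint structure: each edge remembers
    which source graph (control flow, register flow, call) it came from.
    """
    merged = defaultdict(list)
    for label, edges in (("cfg", cfg), ("rfg", rfg), ("fcg", fcg)):
        for src, dst in edges:
            merged[src].append((label, dst))
    return merged


def traces(graph, length):
    """Collect the edge-label sequences of every walk with `length` edges."""
    found = set()

    def walk(node, labels):
        if len(labels) == length:
            found.add(tuple(labels))
            return
        for label, nxt in graph.get(node, []):
            walk(nxt, labels + [label])

    for start in list(graph):
        walk(start, [])
    return found


def similarity(a, b):
    """Jaccard similarity between two trace sets (an assumed metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two toy functions with the same shape but different node names,
# and a third with a different shape.
f1 = merge_graphs([("b0", "b1"), ("b1", "b2")], [("b0", "b2")], [("b1", "callee")])
f2 = merge_graphs([("n0", "n1"), ("n1", "n2")], [("n0", "n2")], [("n1", "other")])
f3 = merge_graphs([("x", "y")], [], [])

t1, t2, t3 = traces(f1, 2), traces(f2, 2), traces(f3, 2)
print(similarity(t1, t2))  # identical trace sets
print(similarity(t1, t3))  # no shared traces
```

Because the traces here record only edge labels, renaming basic blocks does not change the score, which is the property a reuse detector needs. A real system would attach much richer semantic information to nodes and edges than this toy does.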

Original language: English
Pages (from-to): S61-S71
Journal: Digital Investigation
Issue number: S1
Publication status: Published - Mar 1 2015
Externally published: Yes


Keywords

  • Binary program analysis
  • Digital forensics
  • Function identification
  • Malware forensics
  • Reverse engineering

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Information Systems
  • Computer Science Applications
  • Medical Laboratory Technology
  • Law

