OBA2: An onion approach to binary code authorship attribution

Saed Alrabaee, Noman Saleem, Stere Preda, Lingyu Wang, Mourad Debbabi

Research output: Contribution to journal, Article, peer-review

51 Citations (Scopus)


A critical aspect of malware forensics is authorship analysis. The outcome of such analysis usually depends on the reverse engineer's skills and on the volume and complexity of the code under analysis. To assist reverse engineers in this tedious and error-prone task, it is desirable to develop reliable, automated tools that support malware authorship attribution. In recent work, machine learning was used to rank and select syntax-based features such as n-grams and flow graphs. The experimental results showed that the top-ranked features were unique to each author, which was taken as evidence that those features capture the authors' programming styles. In this paper, however, we show that the uniqueness of features does not necessarily correspond to authorship. Specifically, our analysis demonstrates that many "unique" features selected using this method are clearly unrelated to the authors' programming styles, for example, unique IDs or random but unique function names generated by the compiler; furthermore, the overall accuracy is generally unsatisfactory. Motivated by this discovery, we propose a layered Onion Approach for Binary Authorship Attribution called OBA2. The novelty of our approach lies in its three complementary layers: preprocessing, syntax-based attribution, and semantic-based attribution. Experiments show that our method produces results that are not only more accurate but also meaningfully connected to the authors' styles.
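
The pitfall described in the abstract, that a feature can be unique to one author's binary without reflecting programming style, can be illustrated with a toy example. The following is a minimal Python sketch and not the OBA2 implementation or the feature-ranking method the paper critiques. The function names `byte_ngrams` and `unique_ngrams`, the byte strings, and the embedded `ID-…` markers are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 3) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def unique_ngrams(samples: dict[str, bytes], n: int = 3) -> dict[str, set[bytes]]:
    """For each author, keep the n-grams that occur in no other author's sample."""
    grams = {author: set(byte_ngrams(blob, n)) for author, blob in samples.items()}
    unique = {}
    for author, own in grams.items():
        others = set().union(*(g for a, g in grams.items() if a != author))
        unique[author] = own - others
    return unique


# Hypothetical toy "binaries": the code bytes are identical, and the only
# difference is an embedded build identifier (an assumption for illustration).
shared_code = bytes.fromhex("5589e583ec10c9c3")
samples = {
    "author_a": shared_code + b"ID-1111",
    "author_b": shared_code + b"ID-2222",
}

unique = unique_ngrams(samples)
for author, grams in unique.items():
    print(author, sorted(grams))
```

In this toy setup every n-gram that is unique to an author overlaps the build identifier, even though the two samples contain identical code. A ranking based on uniqueness alone would therefore select features that say nothing about style, which is the kind of noise a preprocessing layer would need to filter out before attribution.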

Original language: English
Pages (from-to): s94-s103
Journal: Digital Investigation
Issue number: SUPPL. 1
Publication status: Published - May 2014
Externally published: Yes

Keywords


  • Authorship attribution
  • Binary program analysis
  • Digital forensics
  • Malware forensics
  • Reverse engineering

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Information Systems
  • Computer Science Applications
  • Medical Laboratory Technology
  • Law

