A stratified approach to function fingerprinting in program binaries using diverse features

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Fingerprinting individual functions in binary code is useful in many security applications, ranging from digital forensic analysis of malware corpora to the detection of critical security vulnerabilities. However, existing approaches for fingerprinting functions are typically not resilient to code transformation methods or the use of different compilers. Another common weakness of these approaches is that, when they report a similarity, they do not provide reverse engineers with any insight into the underlying evidence. To bridge this gap, our paper presents PLUMERIA, an obfuscation-resilient and scalable approach based on a stratified architecture comprising three layers. The first layer retrieves as many candidates as possible by capturing statistical characteristics, function behavior, and function neighborhood relationships. The second layer, designed to reduce the number of false positives, trains a linear conditional random field to learn the correlations between the features of a function and its semantics. Finally, the third layer provides insight into the underlying evidence by collecting the side effects exhibited by the candidates selected by the previous layer. Our study evaluates PLUMERIA in several scenarios: fingerprinting functions in obfuscated/de-obfuscated binaries; fingerprinting functions across different compilers; fingerprinting various vulnerabilities across compilers and versions; and fingerprinting standard library functions. We then benchmark PLUMERIA on real-world projects and malware binaries, comparing it with existing state-of-the-art solutions.
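As a rough illustration of the stratified idea described in the abstract (a recall-oriented retrieval layer, a precision-oriented filtering layer, and an evidence-collecting layer), the following Python sketch may help. It is not PLUMERIA's implementation: the feature names (`mnemonics`, `calls`, `side_effects`), the thresholds, and all function names are hypothetical, and a simple Jaccard score stands in for the linear conditional random field of the second layer.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_candidates(target, corpus, loose=0.3):
    """Layer 1 (assumed): keep every function whose mnemonic histogram is
    loosely similar to the target, favouring recall over precision."""
    t = Counter(target["mnemonics"])
    return [name for name, fn in corpus.items()
            if cosine(t, Counter(fn["mnemonics"])) >= loose]


def filter_candidates(target, corpus, candidates, strict=0.5):
    """Layer 2 stand-in: Jaccard overlap of called functions.
    The paper trains a linear CRF here; this is only a placeholder."""
    t = set(target["calls"])
    kept = []
    for name in candidates:
        c = set(corpus[name]["calls"])
        union = t | c
        score = len(t & c) / len(union) if union else 1.0
        if score >= strict:
            kept.append(name)
    return kept


def collect_evidence(target, corpus, kept):
    """Layer 3 (assumed): report the side effects shared with each match."""
    t = set(target["side_effects"])
    return {name: sorted(t & set(corpus[name]["side_effects"]))
            for name in kept}


def fingerprint(target, corpus):
    """Run the three layers in sequence and return matches with evidence."""
    candidates = retrieve_candidates(target, corpus)
    kept = filter_candidates(target, corpus, candidates)
    return collect_evidence(target, corpus, kept)


# Toy data, invented purely for illustration.
target = {"mnemonics": ["mov", "mov", "add", "cmp", "jne", "ret"],
          "calls": ["malloc", "memcpy"],
          "side_effects": ["heap_write", "syscall:write"]}
corpus = {
    "copy_buf": {"mnemonics": ["mov", "mov", "add", "cmp", "jne", "ret"],
                 "calls": ["malloc", "memcpy"],
                 "side_effects": ["heap_write"]},
    "log_msg": {"mnemonics": ["mov", "cmp", "jne", "call", "ret"],
                "calls": ["printf"],
                "side_effects": ["syscall:write"]},
    "hash_round": {"mnemonics": ["xor", "rol", "xor", "rol"],
                   "calls": [],
                   "side_effects": []},
}

print(fingerprint(target, corpus))  # {'copy_buf': ['heap_write']}
```

In this toy run the first layer admits both `copy_buf` and `log_msg`, the second layer discards `log_msg` as a false positive, and the third layer reports the shared side effect as evidence for the remaining match.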

Original language: English
Article number: 116384
Journal: Expert Systems with Applications
Volume: 193
Publication status: Published - May 1 2022

Keywords

  • Binary code
  • Machine learning
  • Reverse engineering

ASJC Scopus subject areas

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence
