BinEye: Towards Efficient Binary Authorship Characterization Using Deep Learning

Saed Alrabaee, El Mouatez Billah Karbab, Lingyu Wang, Mourad Debbabi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

19 Citations (Scopus)

Abstract

In this paper, we present BinEye, an innovative tool that trains a system of three convolutional neural networks to characterize the authors of program binaries based on novel sets of features. The first set of features is obtained by converting executable binary code into a gray image; the second by transforming each executable into a series of bytecode; and the third by representing each function in terms of its opcodes. By leveraging advances in deep learning, we are then able to characterize a large set of authors. This is accomplished even without the missing features and despite the complications arising from compilation. In fact, BinEye does not require any prior knowledge of the target binary. More importantly, an analysis of the model provides a satisfying explanation of the results obtained: BinEye is able to auto-learn each author’s coding style and thus characterize the authors of program binaries. We evaluated BinEye on large datasets extracted from selected open-source C++ projects on GitHub, Google Code Jam events, and several programming projects, comparing it with existing approaches. The experimental results demonstrate that BinEye characterizes a larger number of authors with significantly higher accuracy (above 90%). We also employed it in the context of several case studies. When applied to Zeus and Citadel, BinEye found that this pair might be associated with common authors. For other packages, BinEye demonstrated its ability to identify the presence of multiple authors in binary code.
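
To make the first feature set more concrete, the following is a minimal sketch of one common way to view a binary as a gray image: each byte (0-255) becomes one pixel intensity, and the byte stream is wrapped into fixed-width rows. This is an illustration under stated assumptions, not BinEye's actual preprocessing; the function name `bytes_to_gray_image`, the `width` parameter, and the zero-padding of the last row are all hypothetical choices that the abstract does not specify.

```python
def bytes_to_gray_image(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one gray pixel and wrap rows at `width`.

    Hypothetical helper for illustration; the row width and the padding
    scheme are assumptions, not details taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Zero-pad the tail so that the last row is complete (assumed choice).
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    # Slice the padded byte string into rows of `width` pixel values.
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


if __name__ == "__main__":
    # First bytes of a typical 64-bit little-endian ELF header.
    sample = bytes([0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01])
    for row in bytes_to_gray_image(sample, width=4):
        print(row)
    # prints [127, 69, 76, 70] then [2, 1, 0, 0]
```

A two-dimensional layout like this is what allows a convolutional network to be applied to raw bytes; in practice the rows would typically be resized to a fixed image shape before training.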

Original language: English
Title of host publication: Computer Security – ESORICS 2019 - 24th European Symposium on Research in Computer Security, Proceedings
Editors: Kazue Sako, Steve Schneider, Peter Y.A. Ryan
Publisher: Springer
Pages: 47-67
Number of pages: 21
ISBN (Print): 9783030299613
Publication status: Published - 2019
Event: 24th European Symposium on Research in Computer Security, ESORICS 2019 - Luxembourg, Luxembourg
Duration: Sept 23 2019 – Sept 27 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11736 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th European Symposium on Research in Computer Security, ESORICS 2019
Country/Territory: Luxembourg
City: Luxembourg
Period: 9/23/19 – 9/27/19

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
