TY - GEN
T1 - BinDeep
T2 - 20th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2021
AU - Alrabaee, Saed
AU - Choo, Kim Kwang Raymond
AU - Qbea'h, Mohammad
AU - Khasawneh, Mahmoud
N1 - Funding Information:
We are grateful to the anonymous reviewers for their comments. The first author is partially supported by the United Arab Emirates University Start-up Grant G000 03261.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Mapping a binary function taken from a compiled binary to the same function in the original source code has many security applications, such as discovering reused free open-source code in malware binaries. To facilitate malware analysis, we present BINDEEP, a framework that learns the semantic relationships among binary functions based on assembly code. It also learns semantic information about the source functions in order to carry out function matching. We demonstrate how BINDEEP can be applied to fingerprint the origin of functions in malware binaries, and then benchmark its performance against that of five competing systems (i.e., RESOURCE, the Binary Analysis Tool (BAT), BinPro, Statistical Machine Translation (SMT), and FOSSIL). The findings show that BINDEEP is more robust and achieves a significant improvement over these existing systems when confronted with changes introduced by code transformation methods or the use of different compilers and optimization levels. Furthermore, BINDEEP is able to discover source packages in malware binaries, such as Zeus and Citadel, that match those listed in existing security reports.
KW - binary code
KW - machine learning
KW - malicious code
UR - http://www.scopus.com/inward/record.url?scp=85127431672&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127431672&partnerID=8YFLogxK
U2 - 10.1109/TrustCom53373.2021.00150
DO - 10.1109/TrustCom53373.2021.00150
M3 - Conference contribution
AN - SCOPUS:85127431672
T3 - Proceedings - 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2021
SP - 1100
EP - 1107
BT - Proceedings - 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2021
A2 - Zhao, Liang
A2 - Kumar, Neeraj
A2 - Hsu, Robert C.
A2 - Zou, Deqing
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 October 2021 through 22 October 2021
ER -