TY - GEN
T1 - A hybrid model to detect malicious executables
AU - Masud, Mohammad M.
AU - Khan, Latifur
AU - Thuraisingham, Bhavani
PY - 2007/12/1
Y1 - 2007/12/1
AB - We present a hybrid data mining approach to detect malicious executables. In this approach, we identify important features of malicious and benign executables. A classifier uses these features to learn a model that distinguishes malicious from benign executables. We construct a novel combination of three kinds of features: binary n-grams, assembly n-grams, and library function calls. Binary features are extracted from the binary executables, whereas assembly features are extracted from the disassembled executables. The function call features are extracted from the program headers. We also propose an efficient and scalable feature extraction technique. We apply our model to a large corpus of real benign and malicious executables. We extract the above-mentioned features from the data and train a classifier using a Support Vector Machine. This classifier achieves very high accuracy and a low false positive rate in detecting malicious executables. We compare our model with other feature-based approaches and find that it performs better in terms of detection accuracy and false alarm rate.
KW - Disassembly
KW - Feature extraction
KW - Malicious executable
KW - N-gram analysis
UR - http://www.scopus.com/inward/record.url?scp=38549122470&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38549122470&partnerID=8YFLogxK
U2 - 10.1109/ICC.2007.242
DO - 10.1109/ICC.2007.242
M3 - Conference contribution
AN - SCOPUS:38549122470
SN - 1424403537
SN - 9781424403530
T3 - IEEE International Conference on Communications
SP - 1443
EP - 1448
BT - 2007 IEEE International Conference on Communications, ICC'07
T2 - 2007 IEEE International Conference on Communications, ICC'07
Y2 - 24 June 2007 through 28 June 2007
ER -