Feature based techniques for auto-detection of novel email worms

Mohammad M. Masud, Latifur Khan, Bhavani Thuraisingham

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)


This work applies data mining techniques to the detection of email worms, using a feature-based approach. The features are extracted through statistical and behavioral analyses of emails sent over a certain period of time. Because the number of features extracted in this way is too large, our goal is to select the best set of features that can efficiently distinguish between normal and viral emails using classification techniques. First, we apply Principal Component Analysis (PCA) to reduce the high dimensionality of the data and to find a projected, optimal set of attributes. We observe that applying PCA to a benchmark dataset improves the accuracy of detecting novel worms. Second, we apply the J48 decision tree algorithm to determine the relative importance of features based on information gain. We identify a subset of features, along with a set of classification rules, that performs better in detecting novel worms than either the original set of features or the PCA-reduced features. Finally, we compare our results with published results and discuss our future plans to extend this work.
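The information-gain ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and not Weka's J48: it only scores each numeric feature by the best entropy reduction achievable with a single threshold split, which is the quantity a decision tree learner weighs when choosing a split. The feature names and the toy data are hypothetical, and the PCA step is not covered here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_gain(values, labels):
    """Best gain over candidate thresholds (midpoints of adjacent distinct values)."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((information_gain(values, labels, t) for t in candidates), default=0.0)


def rank_features(rows, labels, names):
    """Return (gain, feature name) pairs, highest gain first."""
    columns = list(zip(*rows))
    scored = [(best_gain(col, labels), name) for col, name in zip(columns, names)]
    return sorted(scored, reverse=True)


# Hypothetical toy data: one row per email, one column per feature.
NAMES = ["attachments_per_email", "mean_words_in_body"]
ROWS = [(0, 120), (0, 80), (1, 95), (1, 110), (1, 100), (0, 90)]
LABELS = ["normal", "normal", "viral", "viral", "viral", "normal"]

ranking = rank_features(ROWS, LABELS, NAMES)
for gain, name in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the first feature separates the two classes perfectly, so it ranks above the second. A feature subset could then be chosen by keeping the top-ranked features, under whatever cutoff suits the dataset.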

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 11th Pacific-Asia Conference, PAKDD 2007, Proceedings
Publisher: Springer Verlag
Number of pages: 12
ISBN (Print): 9783540717003
Publication status: Published - 2007
Externally published: Yes
Event: 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007 - Nanjing, China
Duration: May 22 2007 → May 25 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4426 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007


  • Classification technique
  • Data mining
  • Email worm
  • Feature selection
  • Principal component analysis

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

