Integrating novel class detection with classification for concept-drifting data streams

Mohammad M. Masud, Jing Gao, Latifur Khan, Jiawei Han, Bhavani Thuraisingham

Research output: Chapter in Book/Report/Conference proceedingConference contribution

61 Citations (Scopus)

Abstract

In a typical data stream classification task, it is assumed that the total number of classes are fixed. This assumption may not be valid in a real streaming environment, where new classes may evolve. Traditional data stream classification techniques are not capable of recognizing novel class instances until the appearance of the novel class is manually identified, and labeled instances of that class are presented to the learning algorithm for training. The problem becomes more challenging in the presence of concept-drift, when the underlying data distribution changes over time. We propose a novel and efficient technique that can automatically detect the emergence of a novel class in the presence of concept-drift by quantifying cohesion among unlabeled test instances, and separation of the test instances from training instances. Our approach is non-parametric, meaning, it does not assume any underlying distributions of data. Comparison with the state-of-the-art stream classification techniques prove the superiority of our approach.

Original languageEnglish
Title of host publicationMachine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2009, Proceedings
PublisherSpringer Verlag
Pages79-94
Number of pages16
EditionPART 2
ISBN (Print)3642041736, 9783642041730
DOIs
Publication statusPublished - 2009
Externally publishedYes
EventEuropean Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2009 - Bled, Slovenia
Duration: Sept 7 2009Sept 11 2009

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
NumberPART 2
Volume5782 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

OtherEuropean Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2009
Country/TerritorySlovenia
CityBled
Period9/7/099/11/09

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint

Dive into the research topics of 'Integrating novel class detection with classification for concept-drifting data streams'. Together they form a unique fingerprint.

Cite this