Lacking labels in the stream: Classifying evolving stream data with few labels

Clay Woolam, Mohammad M. Masud, Latifur Khan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Citations (Scopus)

Abstract

This paper outlines a data stream classification technique that addresses the problem of insufficient and biased labeled data. It is practical to assume that only a small fraction of instances in the stream are labeled. A more practical assumption would be that the labeled data may not be independently distributed among all training documents. How can we ensure that a good classification model would be built in these scenarios, considering that the data stream also has an evolving nature? In our previous work we applied semi-supervised clustering to build classification models using a limited amount of labeled training data. However, it assumed that the data to be labeled should be chosen randomly. In our current work, we relax this assumption and propose a label propagation framework for data streams that can build good classification models even if the data are not labeled randomly. Comparison with state-of-the-art stream classification techniques on synthetic and benchmark real data demonstrates the effectiveness of our approach.

Original language: English
Title of host publication: Foundations of Intelligent Systems - 18th International Symposium, ISMIS 2009, Proceedings
Pages: 552-562
Number of pages: 11
DOIs
Publication status: Published - 2009
Externally published: Yes
Event: 18th International Symposium on Methodologies for Intelligent Systems, ISMIS 2009 - Prague, Czech Republic
Duration: Sept 14 2009 to Sept 17 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5722 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 18th International Symposium on Methodologies for Intelligent Systems, ISMIS 2009
Country/Territory: Czech Republic
City: Prague
Period: 9/14/09 to 9/17/09

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
