Ensemble of subset of k-nearest neighbours models for class membership probability estimation

Asma Gul, Zardad Khan, Aris Perperoglou, Osama Mahmoud, Miftahuddin Miftahuddin, Werner Adler, Berthold Lausen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Combining multiple classifiers can substantially improve the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. The technique can also be used for estimating class membership probabilities. We propose an ensemble of k-nearest neighbours (kNN) classifiers for class membership probability estimation in the presence of non-informative features in the data. This is done in two steps. First, we select classifiers based on their individual performance from a set of base kNN models, each generated on a bootstrap sample using a random feature set from the feature space of the training data. Second, a stepwise selection is applied to the selected learners, and those models that maximize the ensemble's predictive performance are added to it. We use benchmark data sets with some added non-informative features to evaluate our method. Experimental comparison of the proposed method with the usual kNN, bagged kNN, random kNN and random forest shows that it leads to high predictive performance, in terms of minimum Brier score, on most of the data sets. The results are also verified by simulation studies.
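
The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes binary 0/1 labels, a separate validation set for scoring, the Brier score (mean squared difference between predicted probability and observed 0/1 outcome) as the criterion in both steps, and simple averaging of member probabilities. The function names (`knn_prob`, `brier`, `fit_ensemble`, `predict_prob`) and the parameters `n_models`, `n_feats`, `k` and `keep` are hypothetical and do not come from the paper.

```python
import random

def knn_prob(train_X, train_y, x, feats, k):
    """Estimate P(y=1 | x) as the share of class-1 labels among the
    k nearest training points, measuring distance on `feats` only."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in feats), label)
        for row, label in zip(train_X, train_y)
    )
    return sum(label for _, label in dists[:k]) / k

def brier(probs, y):
    """Brier score: mean squared error of predicted probabilities."""
    return sum((p - t) ** 2 for p, t in zip(probs, y)) / len(y)

def fit_ensemble(X, y, X_val, y_val, n_models=30, n_feats=2, k=5, keep=10, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    candidates = []
    for _ in range(n_models):
        # Base model: bootstrap sample of rows plus a random feature subset.
        idx = [rng.randrange(n) for _ in range(n)]
        feats = rng.sample(range(d), n_feats)
        bx, by = [X[i] for i in idx], [y[i] for i in idx]
        val_probs = [knn_prob(bx, by, x, feats, k) for x in X_val]
        candidates.append((brier(val_probs, y_val), bx, by, feats, val_probs))

    # Step 1 (assumed form): keep the `keep` models with the lowest
    # individual Brier score on the validation set.
    candidates.sort(key=lambda c: c[0])
    candidates = candidates[:keep]

    # Step 2 (assumed form): forward selection. Start from the best model
    # and add a model only if it lowers the ensemble's Brier score.
    ensemble = [candidates[0]]
    avg, best = list(candidates[0][4]), candidates[0][0]
    for cand in candidates[1:]:
        m = len(ensemble)
        trial = [(a * m + p) / (m + 1) for a, p in zip(avg, cand[4])]
        score = brier(trial, y_val)
        if score < best:
            ensemble.append(cand)
            avg, best = trial, score
    return ensemble, best

def predict_prob(ensemble, x, k=5):
    """Average the class-1 probability over all selected kNN models."""
    probs = [knn_prob(bx, by, x, feats, k) for _, bx, by, feats, _ in ensemble]
    return sum(probs) / len(probs)
```

In this sketch, models built on feature subsets dominated by non-informative features tend to score poorly in step 1 and are dropped before the stepwise search. The paper may use a different data split or selection rule than the validation-set approach assumed here.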

Original language: English
Title of host publication: Analysis of Large and Complex Data
Editors: Adalbert F.X. Wilhelm, Hans A. Kestler
Publisher: Kluwer Academic Publishers
Pages: 411-421
Number of pages: 11
ISBN (Print): 9783319252247
Publication status: Published - 2016
Externally published: Yes
Event: 2nd European Conference on Data Analysis, ECDA 2014 - Bremen, Germany
Duration: Jul 2 2014 to Jul 4 2014

Publication series

Name: Studies in Classification, Data Analysis, and Knowledge Organization
ISSN (Print): 1431-8814

Conference

Conference: 2nd European Conference on Data Analysis, ECDA 2014
Country/Territory: Germany
City: Bremen
Period: 7/2/14 to 7/4/14

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Analysis
