Ensemble Learning with Resampling for Imbalanced Data

Firuz Kamalov, Ashraf Elnagar, Ho Hon Leung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Imbalanced class distribution is an issue that appears in various applications. In this paper, we undertake a comprehensive study of the effects of sampling on the performance of bootstrap aggregating in the context of imbalanced data. Concretely, we carry out a comparison of sampling methods applied to single and ensemble classifiers. The experiments are conducted on simulated and real-life data using a range of sampling methods. The contributions of the paper are twofold: i) demonstrate the effectiveness of ensemble techniques based on resampled data over a single base classifier and ii) compare the effectiveness of different resampling techniques when used during the bagging stage for ensemble classifiers. The results reveal that ensemble methods overwhelmingly outperform single classifiers based on resampled data. In addition, we discover that NearMiss and random oversampling (ROS) are the optimal sampling algorithms for ensemble learning.
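The idea of resampling during the bagging stage can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the base classifier, so a small k-nearest-neighbour learner is assumed here, and all function names and parameters (`random_oversample`, `fit_resampled_bagging`, `predict_bagging`, `k`, `n_estimators`) are hypothetical. Each ensemble member draws a bootstrap sample, balances it with random oversampling (ROS), and the members then vote.

```python
import random
from collections import Counter


def random_oversample(X, y, rng):
    """ROS: duplicate randomly chosen rows of smaller classes until
    every class has as many rows as the largest class."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def knn_predict(model, x, k=3):
    """Assumed base learner: majority label among the k nearest stored rows."""
    X_train, y_train = model
    nearest = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X_train[i])),
    )[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


def fit_resampled_bagging(X, y, n_estimators=25, k=3, seed=0):
    """Train n_estimators members, each on a bootstrap sample balanced by ROS."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]    # bootstrap with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        Xr, yr = random_oversample(Xb, yb, rng)       # resample inside the bag
        models.append((Xr, yr, k))
    return models


def predict_bagging(models, x):
    """Aggregate the members' predictions by majority vote."""
    votes = Counter(knn_predict((Xr, yr), x, k) for Xr, yr, k in models)
    return votes.most_common(1)[0][0]
```

The single-classifier baseline described in the abstract would correspond to calling `random_oversample` once on the full training set and fitting one base learner; swapping `random_oversample` for another sampler gives the other ensemble variants being compared.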

Original language: English
Title of host publication: Intelligent Computing Theories and Application - 17th International Conference, ICIC 2021, Proceedings
Editors: De-Shuang Huang, Kang-Hyun Jo, Jianqiang Li, Valeriya Gribova, Abir Hussain
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 564-578
Number of pages: 15
ISBN (Print): 9783030845285
Publication status: Published - 2021
Event: 17th International Conference on Intelligent Computing, ICIC 2021 - Shenzhen, China
Duration: Aug 12 2021 to Aug 15 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12837 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Intelligent Computing, ICIC 2021
Country/Territory: China
City: Shenzhen
Period: 8/12/21 to 8/15/21

Keywords

  • Data preprocessing sampling
  • Ensemble method
  • Imbalanced data
  • Oversampling
  • Undersampling

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
