Margin weighted robust discriminant score for feature selection in imbalanced gene expression classification

Sheema Gul, Dost Muhammad Khan, Saeed Aldahmani, Zardad Khan

Research output: Contribution to journal › Article › peer-review

Abstract

High-dimensional gene expression data pose significant challenges for binary classification, particularly for feature selection. Conventional methods, such as the Proportional Overlap Score, Wilcoxon Rank-Sum Test, Weighted Signal to Noise Ratio, ensemble Minimum Redundancy and Maximum Relevance, Fisher Score, and Robust Weighted Score for unbalanced data, are affected by class imbalance and feature redundancy. Mitigating these issues requires feature selection methods tailored to class imbalance. This study proposes a more robust solution, the Margin Weighted Robust Discriminant Score (MW-RDS), for feature selection in high-dimensional imbalanced problems. MW-RDS integrates a minority amplification factor so that minority class observations retain their influence during feature ranking. The amplification factor, together with class-specific stability weights obtained from a minority-focused robust discriminant score, is used to maximize the differential capability of genes/features. The score is then weighted by margin weights extracted from support vectors to enhance the discriminative power of genes/features, thereby highlighting their potential for class separation. Finally, the top-ranked genes/features are constrained using ℓ1-regularization to discard redundant genes while identifying the most significant ones. The proposed method is evaluated on 9 openly accessible gene expression datasets using Random Forest, Support Vector Machine, and Weighted k Nearest Neighbors classifiers, in terms of accuracy, sensitivity, specificity, F1-score, and precision. The results show that the proposed method outperforms the existing methods in most cases. Boxplots and stability plots are also provided to give a deeper understanding of the results. To further assess the efficacy of the proposed method, the paper also presents a detailed simulation study.
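
The five performance metrics named in the abstract follow from the counts of a binary confusion matrix. The sketch below is illustrative only and is not taken from the paper: the function name `binary_metrics`, the toy labels, the choice of label 1 as the minority (positive) class, and the convention of returning 0.0 for an undefined ratio are all assumptions made for this example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, precision and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts, with `positive` treated as the minority class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the positive class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical imbalanced toy data: 2 minority (1) and 8 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# A trivial classifier that always predicts the majority class.
majority_only = binary_metrics(y_true, [0] * len(y_true))
```

On this toy data both predictors reach the same accuracy, yet the majority-only predictor has zero sensitivity. This is one reason studies of imbalanced data commonly report sensitivity, specificity, precision, and F1-score alongside accuracy.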

Original language: English
Article number: e0325147
Journal: PLoS ONE
Volume: 20
Issue number: 6 June
Publication status: Published - Jun 2025

ASJC Scopus subject areas

  • General
