Feature selection and classification for gene expression data using novel correlation based overlapping score method via Chou's 5-steps rule

Abdul Wahid, Dost Muhammad Khan, Nadeem Iqbal, Sajjad Ahmad Khan, Amjad Ali, Mukhtaj Khan, Zardad Khan

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)

Abstract

The analysis of omics data together with knowledge-based interpretation can help to obtain important information about different biological processes and to reflect the current physiological status of tissues and cells. The main challenge, however, is extracting disease-related information from high-dimensional gene expression data that contain a large number of redundant genes. To address this problem, gene selection, which eliminates redundant and irrelevant genes, has become a key step. In this article, a feature selection technique is proposed that exploits correlation-based overlapping analysis of expression data across classes. The proposed correlation-based overlapping score (COS) technique is compared with state-of-the-art gene selection approaches on real-world benchmark microarray datasets. In the experimental evaluation, the COS algorithm outperforms the other methods, achieving the lowest misclassification errors with boosting, random forest and k-nearest neighbour (kNN) classifiers. Moreover, the proposed technique is more stable than the other techniques in gene selection.
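The abstract does not state the COS formula, so the following is only a rough sketch of the general workflow it describes: score each gene by how much its expression values overlap across classes, keep the least-overlapping genes, and estimate the misclassification error of a kNN classifier. The range-overlap score used here is a simple stand-in chosen for illustration, not the authors' COS, and every function name (`overlap_fraction`, `rank_genes`, `knn_predict`, `loo_error`) as well as the synthetic data are assumptions.

```python
import random


def overlap_fraction(a, b):
    """Share of the combined value range that is covered by both classes.

    Stand-in score for illustration only (NOT the paper's COS):
    0.0 means the two classes' ranges are disjoint for this gene.
    """
    lo = max(min(a), min(b))
    hi = min(max(a), max(b))
    total = max(max(a), max(b)) - min(min(a), min(b))
    if total == 0:
        return 1.0  # constant gene: classes overlap completely
    return max(0.0, hi - lo) / total


def rank_genes(X, y):
    """Return gene (column) indices ordered from least to most overlap."""
    classes = sorted(set(y))
    assert len(classes) == 2, "this sketch handles two classes only"
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == classes[0]]
        b = [row[j] for row, label in zip(X, y) if label == classes[1]]
        scores.append((overlap_fraction(a, b), j))
    return [j for _, j in sorted(scores)]


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((u - v) ** 2 for u, v in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def loo_error(X, y, n_genes, k=3):
    """Leave-one-out misclassification rate.

    Genes are re-ranked inside every fold so that the held-out sample
    never influences the selection.
    """
    errors = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = rank_genes(X_tr, y_tr)[:n_genes]
        train = [[row[j] for j in genes] for row in X_tr]
        test = [X[i][j] for j in genes]
        errors += knn_predict(train, y_tr, test, k) != y[i]
    return errors / len(X)


# Synthetic data (assumption): 20 samples, 50 genes, genes 0 and 1 informative.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [
    [rng.gauss(5.0 * label if j < 2 else 0.0, 1.0) for j in range(50)]
    for label in y
]
error_rate = loo_error(X, y, n_genes=2)
print("leave-one-out kNN error with 2 selected genes:", error_rate)
```

Re-ranking the genes inside each fold keeps the held-out sample out of the selection step, which avoids an optimistically biased error estimate. The same loop could wrap a boosting or random forest classifier in place of `knn_predict`.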

Original language: English
Article number: 103958
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 199
Publication status: Published - Apr 15 2020
Externally published: Yes

Keywords

  • Classifiers
  • Correlation based overlapping score
  • Feature selection
  • Gene expression data
  • Stability index

ASJC Scopus subject areas

  • Analytical Chemistry
  • Software
  • Computer Science Applications
  • Process Chemistry and Technology
  • Spectroscopy
