A feature selection method for classification within functional genomics experiments based on the proportional overlapping score

Osama Mahmoud, Andrew Harrison, Aris Perperoglou, Asma Gul, Zardad Khan, Metodi V. Metodiev, Berthold Lausen

Research output: Contribution to journal › Article › peer-review

45 Citations (Scopus)

Abstract

Background: Microarray technology, as well as other functional genomics experiments, allows simultaneous measurement of thousands of genes within each sample. Both the prediction accuracy and the interpretability of a classifier can be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on an analysis of how their expression values overlap across classes. This method results in a novel measure, called the proportional overlapping score (POS), of a feature's relevance to a classification task.

Results: We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. The classification error rates computed using the Random Forest, k Nearest Neighbor and Support Vector Machine classifiers show that POS achieves better performance.

Conclusions: A novel gene selection method, POS, is proposed. POS analyzes the overlap of expression values across classes, taking into account the proportions of overlapping samples. It robustly defines a mask for each gene, which allows it to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are used to produce the selected subset of genes.
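The overlap idea described above can be illustrated with a small sketch. This is not the published POS definition, which the abstract does not give. It is a simplified two-class score under stated assumptions: the names `core_interval` and `overlap_score` are hypothetical, Tukey's 1.5 × IQR fences stand in for the paper's outlier handling, and the "mask" simply flags samples that fall outside the region where the two class intervals overlap.

```python
from statistics import quantiles


def core_interval(values):
    """Range of the values lying inside Tukey's fences (assumed outlier rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in values if low_fence <= v <= high_fence]
    return min(kept), max(kept)


def overlap_score(expression, labels):
    """Return (score, mask) for one gene measured on two classes.

    score: proportion of samples inside the overlap of the two class
           intervals (0.0 means the classes do not overlap for this gene).
    mask:  True for each sample that lies outside the overlap region.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    first = [x for x, y in zip(expression, labels) if y == classes[0]]
    second = [x for x, y in zip(expression, labels) if y == classes[1]]
    (a_lo, a_hi), (b_lo, b_hi) = core_interval(first), core_interval(second)
    lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
    if lo > hi:  # the class intervals are disjoint
        return 0.0, [True] * len(expression)
    mask = [not (lo <= x <= hi) for x in expression]
    return mask.count(False) / len(expression), mask


# Made-up expression values for one gene; 9.0 is an outlier in class "A".
expression = [1.0, 1.2, 1.4, 1.6, 9.0, 1.5, 2.0, 2.2, 2.4, 2.6]
labels = ["A"] * 5 + ["B"] * 5
score, mask = overlap_score(expression, labels)
print(score, mask)
```

In this toy input the outlier is excluded when the class "A" interval is built, so it does not widen the overlap region. Genes with lower scores would be ranked as more discriminative under this simplified measure.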

Original language: English
Article number: 274
Journal: BMC Bioinformatics
Volume: 15
Issue number: 1
Publication status: Published - Aug 11 2014
Externally published: Yes

Keywords

  • Feature selection
  • Gene mask
  • Gene ranking
  • Microarray classification
  • Minimum subset of genes
  • Proportional overlap score

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
