Abstract
The analysis of omics data, together with knowledge-based interpretation, can help obtain important information about different biological processes and reflect the current physiological status of tissues and cells. The main challenge, however, is extracting disease-related information from high-dimensional gene expression data that contain a massive number of redundant genes. To address this problem, gene selection, which eliminates redundant and irrelevant genes, has been a key step. In this article, a feature selection technique is proposed that exploits correlation-based overlapping analysis of expression data across classes. The proposed correlation-based overlapping score (COS) technique is compared with state-of-the-art gene selection approaches on real-world benchmark microarray datasets. In the experimental evaluation, the COS algorithm outperforms the other methods, achieving the lowest misclassification errors with boosting, random forest and k-nearest neighbour (kNN) classifiers. Moreover, the proposed technique is more stable in gene selection than the other techniques.
| Original language | English |
|---|---|
| Article number | 103958 |
| Journal | Chemometrics and Intelligent Laboratory Systems |
| Volume | 199 |
| DOIs | |
| Publication status | Published - Apr 15 2020 |
| Externally published | Yes |
Keywords
- Classifiers
- Correlation based overlapping score
- Feature selection
- Gene expression data
- Stability index
ASJC Scopus subject areas
- Analytical Chemistry
- Software
- Computer Science Applications
- Process Chemistry and Technology
- Spectroscopy
Fingerprint
Dive into the research topics of 'Feature selection and classification for gene expression data using novel correlation based overlapping score method via Chou's 5-steps rule'. Together they form a unique fingerprint.