Abstract
Pathway-based microarray classification has opened a new era of genomic research. However, the approach is limited by the quality of pathway data. Pathway data are usually curated from the biological literature, and the curation is context-free: it does not take the specific biological experiment (e.g., a lung cancer experiment) into account, so pathways often contain genes that are uninformative for that experiment. Many methods in this approach neglect this limitation and treat all genes in a pathway as significant. In this paper, we propose a hybrid of support vector machine and smoothly clipped absolute deviation with group-specific tuning parameters (gSVM-SCAD) to select informative genes within pathways before the pathway evaluation process. Our experiments on canine, gender and lung cancer datasets show that gSVM-SCAD achieves significant results in identifying significant genes and pathways and in classification accuracy.
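The abstract names the smoothly clipped absolute deviation (SCAD) penalty and group-specific tuning parameters but does not give their form. As a rough illustration only, the sketch below implements the standard SCAD penalty of Fan and Li (2001) and sums it over coefficients using one tuning parameter per pathway. The function names, the pathway labels, and the default `a = 3.7` are assumptions made for this example and are not taken from the paper; the actual gSVM-SCAD objective may differ.

```python
def scad_penalty(w, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient w.

    lam is the tuning parameter; a = 3.7 is the commonly used default
    (an assumption here, not a value stated in the abstract).
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    x = abs(w)
    if x <= lam:
        # Small coefficients: behaves like the L1 (lasso) penalty.
        return lam * x
    if x <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return (a + 1) * lam * lam / 2


def grouped_scad_penalty(weights, groups, lams, a=3.7):
    """Sum SCAD penalties, using a separate lam for each group (pathway).

    weights: one coefficient per gene
    groups:  the pathway label of each gene (hypothetical labels)
    lams:    dict mapping pathway label -> tuning parameter
    """
    return sum(scad_penalty(w, lams[g], a) for w, g in zip(weights, groups))


if __name__ == "__main__":
    # Two genes in two hypothetical pathways with different tuning parameters.
    total = grouped_scad_penalty([0.5, 10.0], ["p1", "p2"], {"p1": 1.0, "p2": 2.0})
    print(total)
```

In a penalized SVM this term would be added to the hinge loss, so coefficients of uninformative genes are driven toward zero while large coefficients escape further shrinkage.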
Original language | English |
---|---|
Pages (from-to) | 146-161 |
Number of pages | 16 |
Journal | International Journal of Data Mining and Bioinformatics |
Volume | 10 |
Issue number | 2 |
Publication status | Published - 2014 |
Externally published | Yes |
Keywords
- Bioinformatics
- Data mining
- Gene selection
- Smoothly clipped absolute deviation
- Support vector machines
ASJC Scopus subject areas
- Information Systems
- General Biochemistry, Genetics and Molecular Biology
- Library and Information Sciences