A pathway-based approach for analyzing microarray data using random forests

Chin Hui Shi, Mohd Saberi Mohamad, Safaai Deris, Zuwairie Ibrahim

Research output: Contribution to journal › Article › peer-review

Abstract

Although machine learning methods such as random forests have been developed to correlate survival outcomes with a set of genes, few studies have assessed the ability of these methods to incorporate pathway information when analyzing microarray data. In general, genes that are identified without incorporating biological knowledge are more difficult to interpret. Thus, pathway-based survival analysis using machine learning methods represents a promising approach for generating new biological hypotheses from microarray studies. The two popular variants of random forests used in this research for survival data are random survival forests and bivariate node-splitting random survival forests. Three types of datasets are used in this research, each with a three-level outcome. This research compared the four splitting rules available in random survival forests and identified the log-rank test as the most accurate in terms of prediction error. To evaluate the accuracy of the pathway-based survival approach, this research considered employing the area under the receiver operating characteristic curve for censored data. The use of random survival forests for survival outcomes in analyzing microarray data allows researchers to obtain results that are more closely tied to the biological mechanisms of diseases.
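The abstract names the log-rank test as a splitting rule but does not spell it out. As a rough illustration only, the sketch below computes the standard two-sample log-rank statistic for one candidate node split on right-censored data. It is not the authors' implementation; the function name `logrank_split_statistic` and its parameters are hypothetical, and the toy data in the usage note is invented for demonstration.

```python
import math


def logrank_split_statistic(times, events, in_left):
    """Absolute standardized log-rank statistic for a two-way split.

    times   : observed follow-up times (event or censoring)
    events  : True where the event was observed, False where censored
    in_left : True where the sample falls in the left daughter node
    (All names are hypothetical, chosen for this sketch.)
    """
    # Only distinct times at which an event was observed contribute.
    event_times = sorted({t for t, e in zip(times, events) if e})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        # Samples still under observation just before time t.
        at_risk = sum(1 for s in times if s >= t)
        at_risk_left = sum(1 for s, g in zip(times, in_left) if s >= t and g)
        # Events occurring exactly at time t.
        deaths = sum(1 for s, e in zip(times, events) if s == t and e)
        deaths_left = sum(
            1 for s, e, g in zip(times, events, in_left) if s == t and e and g
        )
        # Observed minus expected events in the left node.
        numerator += deaths_left - at_risk_left * deaths / at_risk
        if at_risk > 1:
            frac = at_risk_left / at_risk
            variance += (
                frac * (1 - frac) * (at_risk - deaths) / (at_risk - 1) * deaths
            )
    if variance == 0:
        # Degenerate split (for example, one empty daughter node).
        return 0.0
    return abs(numerator) / math.sqrt(variance)
```

A larger value indicates a split that separates the survival experience of the two daughter nodes more strongly. For instance, with `times = [1, 2, 3, 4, 5, 6]` and all events observed, a split that places the three earliest events in the left node scores higher than one that alternates samples between nodes.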

Original language: English
Pages (from-to): 1253-1257
Number of pages: 5
Journal: ICIC Express Letters
Volume: 6
Issue number: 5
Publication status: Published - May 2012
Externally published: Yes

Keywords

  • Bivariate node-splitting random survival forests
  • Microarray data
  • Pathway
  • Random forests
  • Random survival forests
  • Survival outcomes

ASJC Scopus subject areas

  • Control and Systems Engineering
  • General Computer Science
