Random forest for gene selection and microarray data classification

Kohbalan Moorthy, Mohd Saberi Mohamad

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

A random forest method is used to perform both gene selection and classification of microarray data. The goal of this research is to develop and improve the random forest gene selection method. To that end, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. Ten datasets that consist of different classes are used: Adenocarcinoma, Brain, Breast (Class 2 and 3), Colon, Leukemia, Lymphoma, NCI60, Prostate and Small Round Blue-Cell Tumor (SRBCT). The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes. Furthermore, classification performed on the selected subsets of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.

Original language: English
Title of host publication: Knowledge Technology - Third Knowledge Technology Week, KTW 2011, Revised Selected Papers
Pages: 174-183
Number of pages: 10
Publication status: Published - 2012
Externally published: Yes
Event: 3rd Knowledge Technology Week, KTW 2011 - Kajang, Malaysia
Duration: Jul 18 2011 to Jul 22 2011

Publication series

Name: Communications in Computer and Information Science
Volume: 295 CCIS
ISSN (Print): 1865-0929

Conference

Conference: 3rd Knowledge Technology Week, KTW 2011
Country/Territory: Malaysia
City: Kajang
Period: 7/18/11 to 7/22/11

Keywords

  • cancer classification
  • classification
  • gene expression data
  • gene selection
  • microarray data
  • Random forest

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics
