Multiclass prediction for cancer microarray data using various variables range selection based on random forest

Kohbalan Moorthy, Mohd Saberi Mohamad, Safaai Deris

Research output: Chapter in Book/Report/Conference proceedingConference contribution

3 Citations (Scopus)

Abstract

Continuous data mining has led to the generation of multi class datasets through microarray technology. New improved algorithms are then required to process and interpret these data. Cancer prediction tailored with variable selection process has shown to improve the overall prediction accuracy. Through variable selection process, the amount of informative genes gathered are much lesser than the initial data, yet the selective subset present in other methods cannot be fine-tuned to suit the necessity for particular number of variables. Hence, an improved technique of various variable range selection based on Random Forest method is proposed to allow selective variable subsets for cancer prediction. Our results indicate improvement in the overall prediction accuracy of cancer data based on the improved various variable range selection technique which allows selective variable selection to create best subset of genes. Moreover, this technique can assist in variable interaction analysis, gene network analysis, gene-ranking analysis and many other related fields.

Original languageEnglish
Title of host publicationTrends and Applications in Knowledge Discovery and Data Mining - PAKDD 2013 International Workshops
Subtitle of host publicationDMApps, DANTH, QIMIE, BDM, CDA, CloudSD, Revised Selected Papers
Pages247-257
Number of pages11
DOIs
Publication statusPublished - 2013
Externally publishedYes
Event17th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2013 - Gold Coast, QLD, Australia
Duration: Apr 14 2013Apr 17 2013

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume7867 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference17th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2013
Country/TerritoryAustralia
CityGold Coast, QLD
Period4/14/134/17/13

Keywords

  • Cancer prediction
  • Gene expression
  • Microarray data
  • Random forest
  • Variable selection

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Multiclass prediction for cancer microarray data using various variables range selection based on random forest'. Together they form a unique fingerprint.

Cite this