Consistency of randomized and finite sized decision tree ensembles

Amir Ahmad, Sami M. Halawani, Ibrahim A. Albidewi

Research output: Contribution to journal, Article, peer-review

3 Citations (Scopus)

Abstract

Regression via classification (RvC) is a method in which a regression problem is converted into a classification problem. A discretization process is used to convert continuous target values into classes. The discretized data can then be treated as a classification problem and used with classifiers. In this paper, we use a discretization method, Extreme Randomized Discretization, in which bin boundaries are created randomly to create ensembles. We present an ensemble method for RvC problems. We show theoretically, for a set of problems, that if the number of bins is three, the proposed ensembles for RvC perform better than RvC with the equal-width discretization method. We use these results to show that infinite-sized ensembles consisting of finite-sized decision trees, created by a pure randomized method (split points are created randomly), are not consistent. We also show theoretically, using a set of regression problems, that the performance of these ensembles depends on the size of the member decision trees.
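
The discretization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of bin midpoints to decode a class back into a value, and the perfect-classifier assumption (each member always predicts the correct bin) are all assumptions made here for illustration only.

```python
import bisect
import random


def random_boundaries(y_min, y_max, n_bins, rng):
    """Draw n_bins - 1 bin boundaries uniformly at random in [y_min, y_max]."""
    return sorted(rng.uniform(y_min, y_max) for _ in range(n_bins - 1))


def equal_width_boundaries(y_min, y_max, n_bins):
    """Place n_bins - 1 boundaries so that all bins have the same width."""
    width = (y_max - y_min) / n_bins
    return [y_min + i * width for i in range(1, n_bins)]


def to_class(y, boundaries):
    """Class label of y = index of the bin that contains it."""
    return bisect.bisect_right(boundaries, y)


def bin_midpoint(label, boundaries, y_min, y_max):
    """Decode a class label back to a value (assumption: use the bin midpoint)."""
    edges = [y_min] + list(boundaries) + [y_max]
    return 0.5 * (edges[label] + edges[label + 1])


def ensemble_decode(y, y_min, y_max, n_bins, n_members, rng):
    """Average the decoded value of y over n_members random discretizations.

    Assumes a perfect classifier, so each member returns the true bin of y.
    """
    total = 0.0
    for _ in range(n_members):
        b = random_boundaries(y_min, y_max, n_bins, rng)
        total += bin_midpoint(to_class(y, b), b, y_min, y_max)
    return total / n_members


if __name__ == "__main__":
    rng = random.Random(0)
    y = 0.2
    fixed = equal_width_boundaries(0.0, 1.0, 3)
    print("equal-width decode:", bin_midpoint(to_class(y, fixed), fixed, 0.0, 1.0))
    print("random-ensemble decode:", ensemble_decode(y, 0.0, 1.0, 3, 1000, rng))
```

With a single equal-width discretization, every target value in a bin decodes to the same value, whereas averaging over many random discretizations gives a decoded value that varies with the target. The sketch only illustrates this mechanism; it does not reproduce the paper's theoretical results.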

Original language: English
Pages (from-to): 97-104
Number of pages: 8
Journal: Pattern Analysis and Applications
Volume: 17
Issue number: 1
Publication status: Published - Feb 2014
Externally published: Yes

Keywords

  • Consistency
  • Decision trees
  • Discretization
  • Ensembles
  • Randomization

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
