Abstract
Regression via classification (RvC) is a method in which a regression problem is converted into a classification problem. A discretization process converts the continuous target values into classes, and the discretized data can then be handled by classifiers as a classification problem. In this paper, we use a discretization method, Extreme Randomized Discretization, in which bin boundaries are created randomly, to create ensembles, and we present an ensemble method for RvC problems. We show theoretically, for a set of problems, that if the number of bins is three, the proposed ensembles for RvC perform better than RvC with the equal-width discretization method. We use these results to show that infinite-sized ensembles consisting of finite-sized decision trees created by a pure randomized method (in which split points are chosen randomly) are not consistent. We also show theoretically, using a set of regression problems, that the performance of these ensembles depends on the size of the member decision trees.
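The discretization step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the 1-nearest-neighbour stand-in classifier (the paper works with decision trees), the use of the bin mean to turn a predicted class back into a number, and the plain averaging over ensemble members are all assumptions made for illustration.

```python
import random
from bisect import bisect_right


def equal_width_boundaries(y_min, y_max, n_bins):
    """Baseline: n_bins - 1 evenly spaced boundaries between y_min and y_max."""
    width = (y_max - y_min) / n_bins
    return [y_min + i * width for i in range(1, n_bins)]


def random_boundaries(y_min, y_max, n_bins, rng):
    """Randomized variant: n_bins - 1 boundaries drawn uniformly at random."""
    return sorted(rng.uniform(y_min, y_max) for _ in range(n_bins - 1))


def to_class(y, boundaries):
    """Map a continuous target to the index of the bin that contains it."""
    return bisect_right(boundaries, y)


def nearest_neighbour_class(x, train_x, train_classes):
    """Stand-in classifier (assumption): class of the closest training point."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_classes[i]


def rvc_ensemble_predict(x, train_x, train_y, n_bins=3, n_members=50, seed=0):
    """Average the decoded predictions of members built on random bins."""
    rng = random.Random(seed)
    y_min, y_max = min(train_y), max(train_y)
    total = 0.0
    for _ in range(n_members):
        # Each member gets its own randomly drawn discretization.
        boundaries = random_boundaries(y_min, y_max, n_bins, rng)
        classes = [to_class(y, boundaries) for y in train_y]
        predicted = nearest_neighbour_class(x, train_x, classes)
        # Decode the class as the mean training target in that bin (assumption).
        in_bin = [y for y, c in zip(train_y, classes) if c == predicted]
        total += sum(in_bin) / len(in_bin)
    return total / n_members


# Toy data: y = 2x on x = 0, 1, ..., 10.
train_x = [float(i) for i in range(11)]
train_y = [2.0 * x for x in train_x]
prediction = rvc_ensemble_predict(4.2, train_x, train_y)
print(prediction)
```

With a single fixed set of boundaries, every query that falls in the same bin receives the same decoded value. Drawing new boundaries for each member lets the averaged output vary more smoothly across the target range, which is the intuition behind comparing randomized ensembles with equal-width discretization.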
Original language | English |
---|---|
Pages (from-to) | 97-104 |
Number of pages | 8 |
Journal | Pattern Analysis and Applications |
Volume | 17 |
Issue number | 1 |
Publication status | Published - Feb 2014 |
Externally published | Yes |
Keywords
- Consistency
- Decision trees
- Discretization
- Ensembles
- Randomization
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Artificial Intelligence