Abstract
Regression via classification (RvC) is a method in which a regression problem is converted into a classification problem. A discretization process converts the continuous target values into classes, and the discretized data can then be used to train a classifier. In this paper, we use a discretization method, Extreme Randomized Discretization, in which bin boundaries are chosen randomly, to create ensembles. We present an ensemble method for RvC problems. We show theoretically, for a set of problems, that if the number of bins is three, the proposed ensembles for RvC perform better than RvC with the equal-width discretization method. We use these results to show that infinite-sized ensembles consisting of finite-sized decision trees, created by a pure randomized method (split points are created randomly), are not consistent. We also show theoretically, using a set of regression problems, that the performance of these ensembles depends on the size of the member decision trees.
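The RvC pipeline with randomly placed bin boundaries can be illustrated with a minimal sketch. It is not the paper's implementation: all function names are hypothetical, boundaries are assumed to be drawn uniformly within the range of the training targets, each bin is assumed to be represented by the mean of the training targets that fall in it, and a one-dimensional nearest-neighbour rule stands in for the decision-tree classifiers the abstract refers to.

```python
import bisect
import random
from statistics import mean


def random_boundaries(y, n_bins, rng):
    """Draw n_bins - 1 boundaries uniformly at random within the target range (assumption)."""
    lo, hi = min(y), max(y)
    return sorted(rng.uniform(lo, hi) for _ in range(n_bins - 1))


def to_classes(y, boundaries):
    """Map each continuous target to the index of the bin it falls in."""
    return [bisect.bisect_right(boundaries, v) for v in y]


def bin_representatives(y, labels, n_bins):
    """Represent each non-empty bin by the mean of its training targets (assumption)."""
    reps = {}
    for k in range(n_bins):
        members = [v for v, c in zip(y, labels) if c == k]
        if members:
            reps[k] = mean(members)
    return reps


def nearest_neighbor_label(x_train, labels, x):
    """Stand-in classifier: return the class label of the closest training point."""
    i = min(range(len(x_train)), key=lambda j: abs(x_train[j] - x))
    return labels[i]


def fit_ensemble(y_train, n_bins=3, n_members=50, seed=0):
    """Build one (labels, representatives) pair per random discretization."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        boundaries = random_boundaries(y_train, n_bins, rng)
        labels = to_classes(y_train, boundaries)
        members.append((labels, bin_representatives(y_train, labels, n_bins)))
    return members


def predict(members, x_train, x):
    """Average the bin representatives predicted by all ensemble members."""
    return mean(
        reps[nearest_neighbor_label(x_train, labels, x)] for labels, reps in members
    )


x_train = [i / 10 for i in range(11)]
y_train = list(x_train)  # toy target: y = x
ensemble = fit_ensemble(y_train)
prediction = predict(ensemble, x_train, 0.5)
```

Each member sees the same training data but a different random discretization, so averaging the members smooths out the step-like predictions that any single three-bin discretization would produce.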
| Original language | English |
|---|---|
| Pages (from-to) | 97-104 |
| Number of pages | 8 |
| Journal | Pattern Analysis and Applications |
| Volume | 17 |
| Issue number | 1 |
| Publication status | Published - Feb 2014 |
| Externally published | Yes |
Keywords
- Consistency
- Decision trees
- Discretization
- Ensembles
- Randomization
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Artificial Intelligence