TY - GEN
T1 - A study of random linear oracle ensembles
AU - Ahmad, Amir
AU - Brown, Gavin
PY - 2009
Y1 - 2009
N2 - Random Linear Oracle (RLO) ensembles of Naive Bayes classifiers show excellent performance [12]. In this paper, we investigate the reasons for the success of RLO ensembles. Our study suggests that the decomposition of most of the classes of the dataset into two subclasses each is the reason for the success of the RLO method. Our study leads to the development of a new output-manipulation-based ensemble method: Random Subclasses (RS). In the proposed method, we create new subclasses from each subset of data points that belongs to the same class using the RLO framework, and consider each subclass as a class of its own. The comparative study suggests that RS performs similarly to the RLO method, whereas RS is statistically better than or similar to Bagging and AdaBoost.M1 for most of the datasets. The similar performance of RLO and RS suggests that the creation of local structures (subclasses) is the main reason for the success of RLO. Another conclusion of this study is that RLO is more useful for classifiers (e.g., linear classifiers) that have limited flexibility in their class boundaries. These classifiers cannot learn complex class boundaries. Creating subclasses produces new, easier-to-learn class boundaries.
AB - Random Linear Oracle (RLO) ensembles of Naive Bayes classifiers show excellent performance [12]. In this paper, we investigate the reasons for the success of RLO ensembles. Our study suggests that the decomposition of most of the classes of the dataset into two subclasses each is the reason for the success of the RLO method. Our study leads to the development of a new output-manipulation-based ensemble method: Random Subclasses (RS). In the proposed method, we create new subclasses from each subset of data points that belongs to the same class using the RLO framework, and consider each subclass as a class of its own. The comparative study suggests that RS performs similarly to the RLO method, whereas RS is statistically better than or similar to Bagging and AdaBoost.M1 for most of the datasets. The similar performance of RLO and RS suggests that the creation of local structures (subclasses) is the main reason for the success of RLO. Another conclusion of this study is that RLO is more useful for classifiers (e.g., linear classifiers) that have limited flexibility in their class boundaries. These classifiers cannot learn complex class boundaries. Creating subclasses produces new, easier-to-learn class boundaries.
KW - Classifier Ensemble
KW - Clusters
KW - Naive Bayes
KW - Subclasses
UR - http://www.scopus.com/inward/record.url?scp=70349309670&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349309670&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02326-2_49
DO - 10.1007/978-3-642-02326-2_49
M3 - Conference contribution
AN - SCOPUS:70349309670
SN - 3642023258
SN - 9783642023255
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 488
EP - 497
BT - Multiple Classifier Systems - 8th International Workshop, MCS 2009, Proceedings
T2 - 8th International Workshop on Multiple Classifier Systems, MCS 2009
Y2 - 10 June 2009 through 12 June 2009
ER -