Random Ordinality Ensembles: Ensemble methods for multi-valued categorical data

Amir Ahmad, Gavin Brown

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)


Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have little or no statistical support. In this paper, we propose a new ensemble method, Random Ordinality Ensembles (ROE), that reduces this problem and provides significantly improved accuracies over current ensemble methods. We perform a random projection of the categorical data into a continuous space. As the transformation to continuous data is a random process, each dataset has a different imposed ordinality. A decision tree that learns on this new continuous space is able to use binary splits, and hence reduces the data fragmentation problem. Generally, these binary trees are accurate. Diverse training datasets ensure diverse decision trees in the ensemble. We created two variants of ROE. In the first variant, we used decision trees as the base models for ensembles. In the second variant, we combined the attribute randomisation of Random Subspaces with Random Ordinality. These methods match or outperform other popular ensemble methods. Different properties of these ensembles were studied. The study suggests that random ordinality trees are generally more accurate and smaller than multi-way split decision trees. It is also shown that random ordinality attributes can be used to improve the Bagging and AdaBoost.M1 ensemble methods.
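
The data transformation described in the abstract can be illustrated with a minimal sketch. It assumes that the random projection amounts to giving each categorical value a random integer rank, drawn independently for every attribute and every ensemble member; the abstract does not specify the exact scheme, so this is an illustration rather than the authors' implementation. The helper names (`random_ordinality`, `encode_dataset`, `make_views`, `majority_vote`) are hypothetical, and the tree learner itself is left out: each encoded view would be passed to any decision tree that uses binary splits on numeric attributes.

```python
import random
from collections import Counter


def random_ordinality(values, rng):
    """Map each distinct category to a random integer rank.

    Assumption: a random permutation of ranks stands in for the
    paper's random projection into a continuous space.
    """
    categories = sorted(set(values))
    ranks = list(range(len(categories)))
    rng.shuffle(ranks)
    return dict(zip(categories, ranks))


def encode_dataset(rows, rng):
    """Encode every categorical column with its own random ordering."""
    n_cols = len(rows[0])
    mappings = [
        random_ordinality([row[j] for row in rows], rng) for j in range(n_cols)
    ]
    encoded = [[mappings[j][row[j]] for j in range(n_cols)] for row in rows]
    return encoded, mappings


def make_views(rows, n_members, seed=0):
    """Build one randomly ordered copy of the data per ensemble member."""
    rng = random.Random(seed)
    return [encode_dataset(rows, rng) for _ in range(n_members)]


def majority_vote(predictions):
    """Combine the class labels predicted by the ensemble members."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: two categorical attributes, three rows.
rows = [["red", "small"], ["green", "large"], ["blue", "small"]]
views = make_views(rows, n_members=3)
```

Each element of `views` holds an integer-coded copy of the data together with the mappings needed to encode unseen rows in the same way. Because the orderings differ between members, trees trained on different views see different candidate binary splits, which is the source of diversity the abstract refers to.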

Original language: English
Pages (from-to): 75-94
Number of pages: 20
Journal: Information Sciences
Issue number: 1
Publication status: Published - 2015
Externally published: Yes

Keywords


  • Binary split
  • Categorical data
  • Classifier ensemble
  • Decision tree
  • Multi-way split

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

