Optimal Causal Decision Trees Ensemble for Improved Prediction and Causal Inference

Neelam Younas, Amjad Ali, Hafsa Hina, Muhammad Hamraz, Zardad Khan, Saeed Aldahmani

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Ensemble methods can be used to identify causal relationships in data, supporting better understanding and sounder decisions in high-risk processes. This paper explores the idea of a causal decision tree forest and proposes a regularized ensemble method that integrates optimal causal trees to improve prediction accuracy without compromising the accuracy of heterogeneous treatment effect estimates. The proposed method selects a subset of the most accurate causal trees from a sufficiently large pool based on their out-of-sample error estimates. The selected trees are combined into an ensemble that is used to estimate heterogeneous treatment effects and to predict unseen data. As an example, the proposed method is applied to Pakistan's income function, a dataset of 27964 observations on the wages of workers aged 10 and above. The paper also presents a detailed simulation study in which datasets are generated under 5 different designs. The proposed method is assessed against ordinary least squares (OLS), the least absolute shrinkage and selection operator (LASSO), Ridge, Causal Tree, and the standard decision tree forest (i.e. the causal forest), using mean square error (MSE), root mean square error (RMSE), mean absolute deviation (MAD), and Pearson correlation (r) as performance metrics. The analyses show that the proposed method can be used effectively for estimating heterogeneous treatment effects and achieves better prediction performance than the other methods considered in the paper.
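The selection step and the four performance metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the causal trees are grown, how many are kept, or how the out-of-sample errors are computed. The function names, the parameter `k`, and the toy numbers below are all hypothetical, and MAD is read here as the mean absolute deviation of predictions from observed values.

```python
import math
from statistics import mean

def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))

def mad(y_true, y_pred):
    """Mean absolute deviation of predictions from observations (assumed reading of MAD)."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))

def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mt) ** 2 for a in y_true)
    var_p = sum((b - mp) ** 2 for b in y_pred)
    return cov / math.sqrt(var_t * var_p)

def select_best_trees(oos_errors, k):
    """Return the indices of the k trees with the smallest out-of-sample error."""
    ranked = sorted(range(len(oos_errors)), key=lambda i: oos_errors[i])
    return ranked[:k]

def ensemble_predict(tree_predictions, selected):
    """Average the per-observation predictions of the selected trees."""
    n_obs = len(tree_predictions[0])
    return [mean(tree_predictions[i][j] for i in selected) for j in range(n_obs)]

# Toy data (hypothetical): each inner list stands in for one tree's
# out-of-sample predictions on the same four observations.
y = [1.0, 2.0, 3.0, 4.0]
tree_predictions = [
    [1.0, 2.0, 3.0, 5.0],  # tree 0: slightly too high on the last point
    [0.0, 0.0, 0.0, 0.0],  # tree 1: a poor tree
    [1.0, 2.0, 3.0, 3.0],  # tree 2: slightly too low on the last point
]

oos_errors = [mse(y, p) for p in tree_predictions]
selected = select_best_trees(oos_errors, k=2)
y_hat = ensemble_predict(tree_predictions, selected)

print(selected, y_hat)
print(mse(y, y_hat), rmse(y, y_hat), mad(y, y_hat), pearson_r(y, y_hat))
```

In this toy case the poor tree is dropped and the errors of the two retained trees cancel when averaged, which is the intuition behind pruning a large pool down to its most accurate members before combining them.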

Original language: English
Pages (from-to): 13000-13011
Number of pages: 12
Journal: IEEE Access
Volume: 10
Publication status: Published - 2022

Keywords

  • Causal inference
  • causal decision tree
  • causal random forest
  • ensemble learning
  • heterogeneous treatment effect
  • random forest

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
