How accurate are the machine learning models in improving monthly rainfall prediction in hyper arid environment?

Faisal Baig, Luqman Ali, Muhammad Abrar Faiz, Haonan Chen, Mohsen Sherif

Research output: Contribution to journal > Article > peer-review


Arid regions such as the United Arab Emirates (UAE) face scarce water resources and unpredictable climate patterns. This study investigates how well advanced machine learning (ML) techniques improve rainfall prediction in hyper-arid environments. Using a 30-year dataset (1991 to 2020), it applies XGBoost (XGB), Long Short-Term Memory (LSTM), Random Forest (RF), Gradient Boosting (GB), Support Vector Machine (SVM), Multilayer Perceptron (MLP), Linear Regression (LR), and ensemble methods to the prediction of monthly rainfall over the UAE. In the initial univariate analysis, which used rainfall alone as the predictor, the ML models performed well during the training phase, with XGBoost and the ensemble models both reaching a correlation coefficient (CC) of 0.88. Their performance declined during the testing phase, however, where the maximum CC was 0.45. Traditional models such as Linear Regression and SVM performed poorly in both training and testing, with CC values below 0.3. To address these limitations, a multivariate analysis was conducted by incorporating additional meteorological parameters: wind speed, temperature, humidity, and evapotranspiration. This substantially improved the models' performance during the testing period: XGB achieved a CC of 0.76, LSTM improved from 0.21 to 0.71, and the stacked models rose from an average of 0.44 to 0.82. In addition, a sensitivity analysis using LASSO regression showed that wind speed and minimum temperature were the most influential parameters for monthly rainfall prediction in the arid context.
These two factors had a substantial impact on the accuracy of the predictive models, underscoring their importance for understanding and forecasting rainfall in hyper-arid regions such as the UAE. Identifying these key drivers supports effective water resource management and climate adaptation in such environments. The study offers insights for water resource planning, agriculture, and climate resilience in hyper-arid regions, and further research can build on these results to improve rainfall prediction models and support sustainable development in arid regions.
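The two quantities the abstract relies on, the correlation coefficient (CC) and a LASSO-based sensitivity ranking, can be illustrated with a minimal sketch. It assumes that CC means the Pearson correlation between observed and predicted values and that the sensitivity ranking orders standardized predictors by the magnitude of their LASSO coefficients; the abstract does not confirm either detail. The function names, the penalty `alpha`, and the synthetic data are hypothetical. The synthetic target is built so that the first two predictors matter, so the ranking it prints holds by construction and is not a reproduction of the study's result.

```python
import math
import random


def pearson_cc(obs, pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def standardize(col):
    """Rescale a column to zero mean and unit (population) variance."""
    n = len(col)
    m = sum(col) / n
    s = math.sqrt(sum((v - m) ** 2 for v in col) / n)
    return [(v - m) / s for v in col]


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_cd(columns, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Assumes every column is standardized and y is centered.
    """
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)  # residual y - Xw for the initial w = 0
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(x * (r + w[j] * x) for x, r in zip(xj, resid)) / n
            new_w = soft_threshold(rho, alpha)
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, xj)]
                w[j] = new_w
    return w


# Synthetic monthly records (hypothetical; 240 months = 20 years).
random.seed(0)
names = ["wind_speed", "min_temp", "humidity", "evapotranspiration"]
n_months = 240
raw = [[random.gauss(0.0, 1.0) for _ in range(n_months)] for _ in names]

# By construction, only the first two predictors drive the target.
rain = [
    2.0 * raw[0][i] - 1.5 * raw[1][i] + random.gauss(0.0, 0.5)
    for i in range(n_months)
]

columns = [standardize(c) for c in raw]
mean_rain = sum(rain) / n_months
rain_centered = [v - mean_rain for v in rain]

weights = lasso_cd(columns, rain_centered, alpha=0.1)
ranked = sorted(zip(names, weights), key=lambda t: abs(t[1]), reverse=True)

# In-sample fit quality, measured with the same CC metric.
fitted = [
    sum(w * col[i] for w, col in zip(weights, columns))
    for i in range(n_months)
]
cc_fit = pearson_cc(rain_centered, fitted)

for name, w in ranked:
    print(f"{name}: {w:.3f}")
print(f"in-sample CC: {cc_fit:.3f}")
```

In practice the CC would be computed separately on the training and testing periods, since the gap between the two (for example 0.88 versus 0.45 in the univariate case reported above) is what signals overfitting.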

Original language: English
Article number: 131040
Journal: Journal of Hydrology
Publication status: Published - Apr 2024


Keywords

  • Arid regions
  • Machine learning
  • Rainfall forecasting
  • Sensitivity analysis

ASJC Scopus subject areas

  • Water Science and Technology

