Enhancing Diagnostic Accuracy by Bypassing Traditional Imputation and Leveraging Missing Data in Alzheimer’s Disease Detection Models

Hamzah Dabool, Hany Alashwal, Ahmad A. Moustafa

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Researchers often face significant hurdles when working with datasets that contain a large number of missing values. They must either discard a substantial amount of data, which can sharply reduce the accuracy of a machine learning (ML) model, or impute the missing values, which is far from ideal for sensitive medical datasets. This paper proposes bypassing traditional data imputation in favor of a model that learns from the missing values themselves, and suggests that doing so can improve the accuracy and predictive capability of Alzheimer’s Disease (AD) identification models. We compare state-of-the-art ML models with the XGBoost algorithm, which is designed to integrate the learning of missing values into its training cycle, using the official ADNI datasets, which contain extensive missing values. The experiment also evaluates these models on the same datasets after imputation. The results indicate that this strategy not only bridges the gaps created by missing data but also surpasses the accuracy of traditional methods that rely on filling in incomplete samples. This finding opens new avenues for research in medical diagnostics for conditions such as AD, where data scarcity and imperfections are common. Rethinking how incomplete data is handled could help refine ML applications in healthcare, particularly by improving diagnostic precision for complex diseases such as AD.
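The idea of learning from missing values instead of imputing them can be illustrated with a small sketch. This is not the authors' code or pipeline, and it does not use XGBoost or ADNI data. It is a simplified, pure-Python decision stump that borrows one idea from XGBoost's sparsity-aware split finding: for each candidate split, missing values are tried on both sides and the better "default direction" is kept. The toy data, the function names, and the use of a plain misclassification count (XGBoost itself scores splits with gradient-based gain) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def majority_errors(labels: List[int]) -> int:
    """Count labels misclassified if this side predicts its majority class."""
    ones = sum(labels)
    return min(ones, len(labels) - ones)


def fit_stump(xs: List[Optional[float]], ys: List[int]) -> Tuple[float, bool, int]:
    """Fit a one-feature stump; return (threshold, missing_goes_left, errors).

    None marks a missing value. For every threshold, missing samples are
    routed left and then right, and the direction with fewer errors wins.
    """
    observed = sorted({x for x in xs if x is not None})
    # Candidate thresholds are midpoints between consecutive observed values.
    thresholds = [(a + b) / 2 for a, b in zip(observed, observed[1:])]
    best: Optional[Tuple[float, bool, int]] = None
    for t in thresholds:
        for missing_left in (True, False):
            left: List[int] = []
            right: List[int] = []
            for x, y in zip(xs, ys):
                go_left = missing_left if x is None else x < t
                (left if go_left else right).append(y)
            errors = majority_errors(left) + majority_errors(right)
            if best is None or errors < best[2]:
                best = (t, missing_left, errors)
    if best is None:
        raise ValueError("need at least two distinct observed values")
    return best


def mean_impute(xs: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value with the mean of the observed values."""
    observed = [x for x in xs if x is not None]
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in xs]


# Synthetic toy data (an assumption, not ADNI): low scores carry label 1,
# and samples with a missing score also carry label 1, so the missingness
# itself is informative.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, None, None, None]
ys = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

native = fit_stump(xs, ys)                 # learns where missing values go
imputed = fit_stump(mean_impute(xs), ys)   # missing values replaced by 4.0

print("native  (threshold, missing_left, errors):", native)
print("imputed (threshold, missing_left, errors):", imputed)
```

On this toy data, the stump that learns a default direction separates the training samples without error, while mean imputation places the missing samples on top of an observed value from the other class and leaves one training error. This only illustrates the mechanism on training data and says nothing about the magnitude of any effect on real datasets.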

Original language: English
Title of host publication: ICISDM 2024 - 8th International Conference on Information System and Data Mining
Publisher: Association for Computing Machinery
Pages: 33-38
Number of pages: 6
ISBN (Electronic): 9798400717345
Publication status: Published - Nov 25 2024
Event: 8th International Conference on Information System and Data Mining, ICISDM 2024 - Los Angeles, United States
Duration: Jun 24 2024 to Jun 26 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 8th International Conference on Information System and Data Mining, ICISDM 2024
Country/Territory: United States
City: Los Angeles
Period: 6/24/24 to 6/26/24

Keywords

  • ADNI
  • Alzheimer’s Disease (AD)
  • Data Imputation
  • Extreme Gradient Boosting (XGBoost)
  • Machine Learning
  • Medical Diagnostics
  • Missing Values

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
