Abstract
Researchers often face significant hurdles when working with datasets that contain many missing values. They must either discard a substantial amount of data, which can sharply reduce the accuracy of a machine learning (ML) model, or impute the missing values, which is far from ideal in sensitive medical datasets. This paper proposes bypassing traditional data imputation in favor of a model that learns from the missing values themselves, and suggests that doing so can improve the accuracy and predictive capability of Alzheimer’s Disease (AD) identification models. We compare state-of-the-art ML models with the XGBoost algorithm, which is designed to integrate the learning of missing values into its training cycle, using the official ADNI datasets with extensive missing values. The experiment also evaluates these models on the same datasets after imputation. The results indicate that this strategy not only bridges the gaps created by missing data but also surpasses the accuracy of traditional methods that rely on filling in incomplete samples. This finding opens new avenues for research in medical diagnostics for conditions like AD, where data scarcity and imperfections are common. Rethinking how incomplete data is handled can help refine ML applications in healthcare, particularly by improving diagnostic precision in complex diseases such as AD.
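To make concrete the idea of learning from missing values during training, the following is a minimal sketch of sparsity-aware split finding, the mechanism by which gradient-boosted trees of the XGBoost kind can learn a default branch for missing entries. It is not the authors' pipeline and not the XGBoost library's code. The function names, the toy data, the squared-error loss with an initial prediction of 0, and the omission of the complexity penalty are all assumptions made for illustration.

```python
import math


def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of a split from gradient/hessian sums (complexity penalty omitted)."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right))


def best_split_with_default(values, grads, hess, lam=1.0):
    """Return (gain, threshold, default_direction) for a single feature.

    Missing entries are float('nan'). For each candidate threshold, the
    missing rows are tried on the left and on the right, and the better
    option is kept as the learned default direction.
    """
    present = sorted(
        (v, g, h) for v, g, h in zip(values, grads, hess) if not math.isnan(v)
    )
    g_miss = sum(g for v, g in zip(values, grads) if math.isnan(v))
    h_miss = sum(h for v, h in zip(values, hess) if math.isnan(v))
    g_total = sum(g for _, g, _ in present)
    h_total = sum(h for _, _, h in present)

    best = (0.0, None, None)
    g_left = h_left = 0.0
    for i in range(len(present) - 1):
        v, g, h = present[i]
        g_left += g
        h_left += h
        next_v = present[i + 1][0]
        if next_v == v:
            continue  # no threshold can separate equal values
        threshold = (v + next_v) / 2
        g_right = g_total - g_left
        h_right = h_total - h_left
        # Option 1: missing rows follow the left branch.
        gain_l = split_gain(g_left + g_miss, h_left + h_miss, g_right, h_right, lam)
        # Option 2: missing rows follow the right branch.
        gain_r = split_gain(g_left, h_left, g_right + g_miss, h_right + h_miss, lam)
        for gain, direction in ((gain_l, "left"), (gain_r, "right")):
            if gain > best[0]:
                best = (gain, threshold, direction)
    return best


# Hypothetical toy feature: two rows are missing, and both are positive cases.
nan = float("nan")
feature = [1.0, 2.0, nan, 8.0, 9.0, nan]
labels = [0, 0, 1, 1, 1, 1]
# Squared-error loss with initial prediction 0: gradient = -y, hessian = 1.
grads = [-float(y) for y in labels]
hess = [1.0] * len(labels)

gain, threshold, direction = best_split_with_default(feature, grads, hess)
print(threshold, direction)
```

In this toy example the missing rows share the label of the high-valued rows, so the search sends them to the right branch. The missingness itself carries signal that an imputed value could blur.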
| Original language | English |
|---|---|
| Title of host publication | ICISDM 2024 - 8th International Conference on Information System and Data Mining |
| Publisher | Association for Computing Machinery |
| Pages | 33-38 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798400717345 |
| Publication status | Published - Nov 25 2024 |
| Event | 8th International Conference on Information System and Data Mining, ICISDM 2024 - Los Angeles, United States. Duration: Jun 24 2024 → Jun 26 2024 |
Publication series
| Name | ACM International Conference Proceeding Series |
|---|---|
Conference
| Conference | 8th International Conference on Information System and Data Mining, ICISDM 2024 |
|---|---|
| Country/Territory | United States |
| City | Los Angeles |
| Period | 6/24/24 → 6/26/24 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- ADNI
- Alzheimer’s Disease (AD)
- Data Imputation
- Extreme Gradient Boosting (XGBoost)
- Machine Learning
- Medical Diagnostics
- Missing Values
ASJC Scopus subject areas
- Human-Computer Interaction
- Computer Networks and Communications
- Computer Vision and Pattern Recognition
- Software
Fingerprint
Dive into the research topics of 'Enhancing Diagnostic Accuracy by Bypassing Traditional Imputation and Leveraging Missing Data in Alzheimer’s Disease Detection Models'. Together they form a unique fingerprint.