TY - JOUR
T1 - Feature-ranking-based ensemble classifiers for survivability prediction of intensive care unit patients using lab test data
AU - Alam, Md Zahangir
AU - Masud, Mohammad M.
AU - Rahman, M. Saifur
AU - Cheratta, Muhsin
AU - Nayeem, Muhammad Ali
AU - Rahman, M. Sohel
N1 - Funding Information:
The first author is supported by an ICT Fellowship administered by ICT Division, Government of the People's Republic of Bangladesh, website: https://ictd.gov.bd/. This work was also supported in part by Abu Dhabi Department of Education and Knowledge (ADEK) Award for Research Excellence (AARE) 2017 Award No: AARE17-182.
Research into clinical decision support (CDS) and clinical prediction (CP) has received increasing attention over the last decade owing to the significant improvements they have brought about in the quality, safety, efficiency, and effectiveness of healthcare. Intensive care unit (ICU) patients require extensive care and monitoring. Actions recommended via CDS or CP can help caregivers avoid unwanted circumstances or improve patient health.
Calvert et al. developed the AutoTriage algorithm, which uses eight common clinical variables (mostly representing physiological measurements) and two or three discretized variables [9]. These clinical variables produce subscores, and a combination of weighted subscores is used as the final score for the mortality prediction of ICU patients. They conducted experiments on the Medical Information Mart for Intensive Care III (MIMIC-III) hospital database of ICU patients. Bhattacharya et al. conducted a study on ICU mortality prediction that addressed the class imbalance issue in ICU data in the binary classification context using a feature transformation approach [11]. They used demographic data, 37 lab investigations, and some physiological signal measurements in their experiments. Xie et al. outlined the essential procedures and concepts for developing prediction models for in-hospital mortality prediction of ICU patients [12]. Therein, an artificial neural network (ANN), a decision tree (DT), and a support vector machine (SVM) demonstrated promising results in analyzing large and heterogeneous data compared with logistic regression (LR). They used physiological and other types of variables, such as age, comorbidity, and admission type. Awad et al. proposed the early mortality prediction for ICU patients (EMPICU) method [13] to predict mortality 6 h after admission to the ICU. They included demographic, physiological, vital sign, and laboratory test variables collected from the MIMIC-II database. Nguyen et al. proposed a deep learning architecture based on long short-term memory (LSTM) networks with a layered attention mechanism to predict ICU mortality while addressing the issue of missing measurements [14]; they employed 41 different measurements (including vital signs).
The datasets used herein are extracted from the MIMIC-II (Physionet MIMICII) [36] and MIMIC-III (Physionet-MIMICIII) [37] databases. These databases were developed by the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC-II) project and the Medical Information Mart for Intensive Care (MIMIC-III) project, respectively, at the Laboratory of Computational Physiology at MIT, which is funded by the National Institute of Biomedical Imaging and Bioengineering. Each database contains the results of every lab test performed on each patient. Each lab test result contains the numeric value of the test, a flag (a binary value indicating whether the result is "normal" or "abnormal"), the measurement unit, and the date and time of the test.
Publisher Copyright:
© 2020 The Author(s)
PY - 2021/1
Y1 - 2021/1
AB - Clinical decision support systems (CDSSs) have received increasing research attention in recent years because they can improve the quality, safety, efficiency, and effectiveness of healthcare. A CDSS combined with advanced data analytics is more accurate and efficient than traditional systems. In this domain, survival or deterioration prediction of critical care patients, e.g., intensive care unit (ICU) patients, is an active research area. Early deterioration prediction can help healthcare providers deliver efficient and effective patient care. Research in this field is primarily based on vital signs, and very few studies have investigated survival prediction using lab test data. Although some studies have made advancements in this field, accuracy remains insufficient. Thus, this study aims to improve the accuracy and efficiency of survival prediction for ICU patients. We propose a feature-ranking-based ensemble of classifiers for survival prediction of ICU patients using only lab test data. In the proposed method, features are evaluated first, and subsets of useful features are selected. Subsequently, training data with the selected features are clustered using a feature vector compaction (FVC) technique. Finally, ensemble classifier models are trained. Extensive experiments with over 3000 different settings on six ICU patient datasets were performed to evaluate the efficacy of the proposed method. The proposed technique achieves a weighted average F1 score (Fwa) as high as 82.6% with a support vector machine classifier when feature ranking is used with a combination of vertical and horizontal grouping-based FVC. The experimental results demonstrate that this technique outperforms existing methods, with the Fwa score difference being as high as 4.5%.
KW - Clinical prediction
KW - Clustering
KW - Feature grouping
KW - Feature vector compaction
KW - ICU patients
KW - Lab test data
UR - http://www.scopus.com/inward/record.url?scp=85098759040&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098759040&partnerID=8YFLogxK
U2 - 10.1016/j.imu.2020.100495
DO - 10.1016/j.imu.2020.100495
M3 - Article
AN - SCOPUS:85098759040
SN - 2352-9148
VL - 22
JO - Informatics in Medicine Unlocked
JF - Informatics in Medicine Unlocked
M1 - 100495
ER -