TY - JOUR
T1 - Regularized ensemble learning for prediction and risk factors assessment of students at risk in the post-COVID era
AU - Khan, Zardad
AU - Ali, Amjad
AU - Khan, Dost Muhammad
AU - Aldahmani, Saeed
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - The COVID-19 pandemic has had a significant impact on students’ academic performance. The effects of the pandemic have varied among students, but some general trends have emerged. One of the primary challenges for students during the pandemic has been the disruption of their study habits. Students getting used to online learning routines might find it even more challenging to perform well in face to face learning. Therefore, assessing various potential risk factors associated with students low performance and its prediction is important for early intervention. As students’ performance data encompass diverse behaviors, standard machine learning methods find it hard to get useful insights for beneficial practical decision making and early interventions. Therefore, this research explores regularized ensemble learning methods for effectively analyzing students’ performance data and reaching valid conclusions. To this end, three pruning strategies are implemented for the random forest method. These methods are based on out-of-bag sampling, sub-sampling and sub-bagging. The pruning strategies discard trees that are adversely affected by the unusual patterns in the students data forming forests of accurate and diverse trees. The methods are illustrated on an example data collected from university students currently studying on campus in a face-to-face modality, who studied during the COVID-19 pandemic through online learning. The suggested methods outperform all the other methods considered in this paper for predicting students at the risk of academic failure. Moreover, various factors such as class attendance, students interaction, internet connectivity, pre-requisite course(s) during the restrictions, etc., are identified as the most significant features.
AB - The COVID-19 pandemic has had a significant impact on students’ academic performance. The effects of the pandemic have varied among students, but some general trends have emerged. One of the primary challenges for students during the pandemic has been the disruption of their study habits. Students getting used to online learning routines might find it even more challenging to perform well in face to face learning. Therefore, assessing various potential risk factors associated with students low performance and its prediction is important for early intervention. As students’ performance data encompass diverse behaviors, standard machine learning methods find it hard to get useful insights for beneficial practical decision making and early interventions. Therefore, this research explores regularized ensemble learning methods for effectively analyzing students’ performance data and reaching valid conclusions. To this end, three pruning strategies are implemented for the random forest method. These methods are based on out-of-bag sampling, sub-sampling and sub-bagging. The pruning strategies discard trees that are adversely affected by the unusual patterns in the students data forming forests of accurate and diverse trees. The methods are illustrated on an example data collected from university students currently studying on campus in a face-to-face modality, who studied during the COVID-19 pandemic through online learning. The suggested methods outperform all the other methods considered in this paper for predicting students at the risk of academic failure. Moreover, various factors such as class attendance, students interaction, internet connectivity, pre-requisite course(s) during the restrictions, etc., are identified as the most significant features.
KW - COVID-19
KW - Classification
KW - Ensemble learning
KW - Face-to-face learning
KW - Students at risk
UR - http://www.scopus.com/inward/record.url?scp=85198369813&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85198369813&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-66894-1
DO - 10.1038/s41598-024-66894-1
M3 - Article
C2 - 39003293
AN - SCOPUS:85198369813
SN - 2045-2322
VL - 14
JO - Scientific reports
JF - Scientific reports
IS - 1
M1 - 16200
ER -