TY - JOUR
T1 - Predicting the multiple parameters of organic acceptors through machine learning using RDkit descriptors
T2 - An easy and fast pipeline
AU - Katubi, Khadijah Mohammedsaleh
AU - Saqib, Muhammad
AU - Mubashir, Tayyaba
AU - Tahir, Mudassir Hussain
AU - Halawa, Mohamed Ibrahim
AU - Akbar, Alveena
AU - Basha, Beriham
AU - Sulaman, Muhammad
AU - Alrowaili, Z. A.
AU - Al-Buriahi, M. S.
N1 - Publisher Copyright: © 2023 Wiley Periodicals LLC.
PY - 2023/12/5
Y1 - 2023/12/5
N2 - Machine learning (ML) analysis has gained considerable importance among researchers for predicting multiple parameters and designing efficient donor and acceptor materials without experimentation. Data are collected from the literature and subsequently used to predict key properties of organic solar cells, such as power conversion efficiency (PCE) and energy levels (HOMO/LUMO). Of the various models tested, the hist gradient boosting (HGB) and light gradient boosting (LGBM) regression models showed better predictive capabilities. These selected models are then further tuned to improve prediction. For the prediction of PCE (test set), LGBM gives a coefficient of determination (R²) of 0.787, which is higher than that of HGB (R² = 0.680). For the prediction of HOMO (test set), LGBM gives an R² of 0.566, which is marginally higher than that of HGB (R² = 0.563). For the prediction of LUMO (test set), however, LGBM gives an R² of 0.605, which is marginally lower than that of HGB (R² = 0.606). Among the three predicted properties, predictive ability is highest for PCE. These models help to identify efficient acceptors in a short time and at lower computational cost.
AB - Machine learning (ML) analysis has gained considerable importance among researchers for predicting multiple parameters and designing efficient donor and acceptor materials without experimentation. Data are collected from the literature and subsequently used to predict key properties of organic solar cells, such as power conversion efficiency (PCE) and energy levels (HOMO/LUMO). Of the various models tested, the hist gradient boosting (HGB) and light gradient boosting (LGBM) regression models showed better predictive capabilities. These selected models are then further tuned to improve prediction. For the prediction of PCE (test set), LGBM gives a coefficient of determination (R²) of 0.787, which is higher than that of HGB (R² = 0.680). For the prediction of HOMO (test set), LGBM gives an R² of 0.566, which is marginally higher than that of HGB (R² = 0.563). For the prediction of LUMO (test set), however, LGBM gives an R² of 0.605, which is marginally lower than that of HGB (R² = 0.606). Among the three predicted properties, predictive ability is highest for PCE. These models help to identify efficient acceptors in a short time and at lower computational cost.
KW - RDkit
KW - hist gradient boosting regression model
KW - light gradient boosting regression model
KW - machine learning
KW - organic acceptors
UR - https://www.scopus.com/pages/publications/85169125286
UR - https://www.scopus.com/pages/publications/85169125286#tab=citedBy
U2 - 10.1002/qua.27230
DO - 10.1002/qua.27230
M3 - Article
AN - SCOPUS:85169125286
SN - 0020-7608
VL - 123
JO - International Journal of Quantum Chemistry
JF - International Journal of Quantum Chemistry
IS - 23
M1 - e27230
ER -