TY - JOUR
T1 - AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes
AU - Ahmad, Rahaf M.
AU - Ali, Bassam R.
AU - Al-Jasmi, Fatma
AU - Al Dhaheri, Noura
AU - Al Turki, Saeed
AU - Kizhakkedath, Praseetha
AU - Mohamad, Mohd Saberi
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial especially in clinical settings such as in oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only, to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels like low and moderate in addition to high, reinforcing the value of disease-specific tools. The top-performing tools on ClinVar dataset identified were MutPred (Accuracy = 0.73), Meta-RNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). While on HGMD dataset they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients with potential implications for other conditions.
AB - Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial especially in clinical settings such as in oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only, to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels like low and moderate in addition to high, reinforcing the value of disease-specific tools. The top-performing tools on ClinVar dataset identified were MutPred (Accuracy = 0.73), Meta-RNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). While on HGMD dataset they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients with potential implications for other conditions.
KW - Artificial intelligence (AI)
KW - Breast cancer
KW - Health data science
KW - Pathogenicity prediction tools
KW - Single nucleotide variants (SNVs)
UR - http://www.scopus.com/inward/record.url?scp=85203421150&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85203421150&partnerID=8YFLogxK
U2 - 10.1186/s40246-024-00667-9
DO - 10.1186/s40246-024-00667-9
M3 - Article
C2 - 39256852
AN - SCOPUS:85203421150
SN - 1473-9542
VL - 18
JO - Human Genomics
JF - Human Genomics
IS - 1
M1 - 99
ER -