Utilizing cost-sensitive machine learning classifiers to identify compounds that inhibit Alzheimer's APP translation

Hany Alashwal, Juwayni Lucman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Virtual screening of bioassay data can help identify compounds that restrict the production of amyloid beta peptides (Aβ), observed in Alzheimer's patients, by inhibiting the translation of amyloid precursor protein (APP). Machine learning classifiers can be applied to such datasets to find those compounds. However, the active molecules that inhibit APP translation are far fewer than their inactive counterparts. This class imbalance is handled by introducing cost sensitivity, which reweights the training instances according to the misclassification cost assigned to each class. The paper reports the performance of cost-sensitive classifiers (Random Forest, Naive Bayes, and Logistic Regression) in separating the minority (active) molecules from the majority (inactive) class. Sensitivity, specificity, false negative rate, ROC area, and accuracy are evaluated while keeping the false positive rate at 20.6%. The aim of the study is to identify the most reliable classifier for the bioassay data and to explore the ideal misclassification cost. Random Forest was the most robust model compared with Naive Bayes and Logistic Regression. Moreover, each classifier had a different optimal misclassification cost.
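The reweighting idea and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the tooling or the cost values, so the function names `cost_weights` and `evaluate`, the toy labels, and the cost of 4 for the active class are all hypothetical. Each training instance receives a weight proportional to the misclassification cost of its class, and the weights are rescaled so that their total equals the number of instances.

```python
def cost_weights(labels, cost):
    """Weight each instance by its class cost (hypothetical helper).

    Weights are rescaled so they sum to len(labels), which keeps the
    effective training-set size unchanged.
    """
    raw = [cost[y] for y in labels]
    scale = len(labels) / sum(raw)
    return [w * scale for w in raw]


def evaluate(y_true, y_pred, positive="active"):
    """Confusion-matrix metrics; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_rate": fn / (tp + fn),
        "false_positive_rate": fp / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy data (assumed): 2 active molecules, 8 inactive ones.
y_true = ["active"] * 2 + ["inactive"] * 8
# Assumed costs: missing an active molecule is 4x as costly.
weights = cost_weights(y_true, {"active": 4.0, "inactive": 1.0})

# Hypothetical predictions from some classifier trained with those weights.
y_pred = ["active", "inactive"] + ["active"] * 2 + ["inactive"] * 6
metrics = evaluate(y_true, y_pred)
```

With these assumed costs the two active instances carry the same total weight as the eight inactive ones, so a learner that accepts per-instance weights sees a balanced problem. Raising or lowering the active-class cost shifts the trade-off between sensitivity and the false positive rate.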

Original language: English
Title of host publication: Proceedings of the 2020 4th International Conference on Cloud and Big Data Computing, ICCBDC 2020
Publisher: Association for Computing Machinery
Pages: 113-117
Number of pages: 5
ISBN (Electronic): 9781450375382
Publication status: Published - Aug 26 2020
Event: 4th International Conference on Cloud and Big Data Computing, ICCBDC 2020 - Virtual, Online, United Kingdom
Duration: Aug 26 2020 to Aug 28 2020

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 4th International Conference on Cloud and Big Data Computing, ICCBDC 2020
Country/Territory: United Kingdom
City: Virtual, Online
Period: 8/26/20 to 8/28/20

Keywords

  • Alzheimer's Disease
  • Classification
  • Cost Sensitivity
  • Logistic Regression
  • Naive Bayes
  • Primary Screen Bioassay
  • Random Forest

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
