TY - GEN
T1 - Gene-disease association through topological and biological feature integration
AU - Hanna, Eileen Marie
AU - Zaki, Nazar M.
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2016/1/12
Y1 - 2016/1/12
N2 - The large amounts of biological information generated using advanced high-throughput experimental techniques continue to motivate the design of suitable methods for valuable knowledge mining. Finding proper means to examine and analyze such information allows better understanding of normal biological processes as well as uncovering malfunctions that trigger various diseases. Several computational approaches were developed to complement the experimental work which is often restricted by high time and cost requirements. In this paper, we consider the problem of disease-gene association and we propose a methodology based on a classification approach which integrates protein-protein interaction network topology features and biological information collected from various data sources. When applied on a dataset of multiple disease types and using the Naive Bayes classifier, our method achieves an AUC score of 0.941. We also consider two case studies of type II diabetes mellitus and breast cancer. The experimental results greatly favor our approach.
AB - The large amounts of biological information generated using advanced high-throughput experimental techniques continue to motivate the design of suitable methods for valuable knowledge mining. Finding proper means to examine and analyze such information allows better understanding of normal biological processes as well as uncovering malfunctions that trigger various diseases. Several computational approaches were developed to complement the experimental work which is often restricted by high time and cost requirements. In this paper, we consider the problem of disease-gene association and we propose a methodology based on a classification approach which integrates protein-protein interaction network topology features and biological information collected from various data sources. When applied on a dataset of multiple disease types and using the Naive Bayes classifier, our method achieves an AUC score of 0.941. We also consider two case studies of type II diabetes mellitus and breast cancer. The experimental results greatly favor our approach.
KW - biological features
KW - gene-disease association
KW - protein-protein interactions
KW - topological features
UR - http://www.scopus.com/inward/record.url?scp=84969883941&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969883941&partnerID=8YFLogxK
U2 - 10.1109/INNOVATIONS.2015.7381544
DO - 10.1109/INNOVATIONS.2015.7381544
M3 - Conference contribution
AN - SCOPUS:84969883941
T3 - Proceedings - 2015 11th International Conference on Innovations in Information Technology, IIT 2015
SP - 225
EP - 229
BT - Proceedings - 2015 11th International Conference on Innovations in Information Technology, IIT 2015
A2 - Ismail, Leila
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th International Conference on Innovations in Information Technology, IIT 2015
Y2 - 1 November 2015 through 3 November 2015
ER -