TY - GEN
T1 - Detecting protein complexes from noisy protein interaction data
AU - Efimov, Dmitry
AU - Zaki, Nazar
AU - Berengueres, Jose
PY - 2012
Y1 - 2012
N2 - High-throughput experimental techniques have made available large datasets of experimentally detected protein-protein interactions. However, experimentally determined protein complexes datasets are not exhaustive nor reliable. A protein complex plays a key role in disease development. Therefore, the identification and characterization of protein complexes involved is crucial to the understanding of the molecular events under normal and abnormal physiological conditions. In this paper, we propose a novel graph mining algorithm to identify protein complexes. The algorithm first checks the quality of the interaction data, then predicts protein complexes based on the concept of weighted clustering coefficient. To demonstrate the effectiveness of our proposed method, we present experimental results on yeast protein interaction data. The level of accuracy achieved is a strong argument in favor of the proposed method. Novel protein complexes were also predicted to assist biologists in their search for protein complexes. The datasets and programs are freely available from http://faculty.uaeu.ac.ae/nzaki/PE-WCC.htm.
AB - High-throughput experimental techniques have made available large datasets of experimentally detected protein-protein interactions. However, experimentally determined protein complexes datasets are not exhaustive nor reliable. A protein complex plays a key role in disease development. Therefore, the identification and characterization of protein complexes involved is crucial to the understanding of the molecular events under normal and abnormal physiological conditions. In this paper, we propose a novel graph mining algorithm to identify protein complexes. The algorithm first checks the quality of the interaction data, then predicts protein complexes based on the concept of weighted clustering coefficient. To demonstrate the effectiveness of our proposed method, we present experimental results on yeast protein interaction data. The level of accuracy achieved is a strong argument in favor of the proposed method. Novel protein complexes were also predicted to assist biologists in their search for protein complexes. The datasets and programs are freely available from http://faculty.uaeu.ac.ae/nzaki/PE-WCC.htm.
KW - Clustering coefficient
KW - Detecting protein complexes
KW - Interaction reliability
KW - Protein complex
KW - Protein-protein interaction network
UR - http://www.scopus.com/inward/record.url?scp=84866603595&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866603595&partnerID=8YFLogxK
U2 - 10.1145/2350176.2350177
DO - 10.1145/2350176.2350177
M3 - Conference contribution
AN - SCOPUS:84866603595
SN - 9781450315524
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1
EP - 7
BT - Proc. of the 11th Int. Workshop on Data Mining in Bioinformatics, BIOKDD 2012 - Held in Conjunction with the 18th ACM SIGKDD Int. Conference on Knowledge Discovery and Data Mining, SIGKDD'12
T2 - 11th International Workshop on Data Mining in Bioinformatics, BIOKDD 2012 - Held in Conjunction with the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, SIGKDD'12
Y2 - 12 August 2012 through 12 August 2012
ER -