TY - GEN
T1 - Prediction of protein inter-domain linkers using compositional index and simulated annealing
AU - Shatnawi, Maad
AU - Zaki, Nazar
PY - 2013
Y1 - 2013
N2 - Protein chains are typically large and consist of multiple domains which are difficult and computationally expensive to characterize using experimental methods. Therefore, accurate and reliable prediction of protein domain boundaries is often the initial step in both experimental and computational protein research. In this paper, we propose a straightforward yet effective method to predict inter-domain linker segments by using the amino acid compositional index from the amino acid sequence information. Each amino acid in the protein sequence is represented by a compositional index which is deduced from a combination of the difference in amino acid occurrences in domains and linker segments in training protein sequences and the amino acid composition information. Further, we employ simulated annealing to improve the prediction by finding the optimal set of threshold values that separate domains from inter-domain linkers. The performance of the proposed method is compared to the current approaches on two protein sequence datasets. Experimental results show superior performance by the proposed method when compared to the state-of-the-art methods for inter-domain linker prediction.
AB - Protein chains are typically large and consist of multiple domains which are difficult and computationally expensive to characterize using experimental methods. Therefore, accurate and reliable prediction of protein domain boundaries is often the initial step in both experimental and computational protein research. In this paper, we propose a straightforward yet effective method to predict inter-domain linker segments by using the amino acid compositional index from the amino acid sequence information. Each amino acid in the protein sequence is represented by a compositional index which is deduced from a combination of the difference in amino acid occurrences in domains and linker segments in training protein sequences and the amino acid composition information. Further, we employ simulated annealing to improve the prediction by finding the optimal set of threshold values that separate domains from inter-domain linkers. The performance of the proposed method is compared to the current approaches on two protein sequence datasets. Experimental results show superior performance by the proposed method when compared to the state-of-the-art methods for inter-domain linker prediction.
KW - Amino acid composition
KW - Compositional index
KW - Domain boundary prediction
KW - Domain linkers
KW - Inter-domain linker segments
KW - Protein sequences
KW - Simulated annealing
UR - http://www.scopus.com/inward/record.url?scp=84882428715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84882428715&partnerID=8YFLogxK
U2 - 10.1145/2464576.2482740
DO - 10.1145/2464576.2482740
M3 - Conference contribution
AN - SCOPUS:84882428715
SN - 9781450319645
T3 - GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference Companion
SP - 1603
EP - 1608
BT - GECCO 2013 - Proceedings of the 2013 Genetic and Evolutionary Computation Conference Companion
T2 - 15th Annual Conference on Genetic and Evolutionary Computation, GECCO 2013
Y2 - 6 July 2013 through 10 July 2013
ER -