Abstract
Protein chains are generally long and often consist of multiple domains. Domains are distinct structural units of a protein that can evolve and function independently. Accurate and reliable prediction of protein domain linkers and boundaries is often considered the initial step in predicting protein tertiary structure and function. In this paper, we introduce CISA, a method for predicting inter-domain linker regions from amino acid sequence information alone. The method first computes an amino acid compositional index from a protein sequence dataset of domain and linker segments and from the amino acid composition. A preference profile is then generated by averaging the compositional index values along the amino acid sequence with a sliding window. Finally, the protein sequence is segmented into intervals, and a simulated annealing algorithm refines the prediction by finding, for each segment, the optimal threshold value that separates domains from inter-domain linkers. The method was tested on two standard protein datasets and showed considerable improvement over state-of-the-art domain linker prediction methods.
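The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the published CISA implementation: the abstract does not give the exact formula for the compositional index, the window size, the segmentation rule, or the annealing objective, so everything below is an assumption made for illustration. In particular, the index is assumed to be a smoothed log-ratio of an amino acid's frequency in linker versus domain segments (so higher values read as more linker-like), the objective is assumed to be agreement with known per-residue labels, and the names `compositional_index`, `preference_profile`, and `anneal_threshold` are hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def compositional_index(linker_segments, domain_segments, pseudocount=1.0):
    """Assumed index: log-ratio of smoothed linker vs. domain frequencies."""
    linker_counts = Counter("".join(linker_segments))
    domain_counts = Counter("".join(domain_segments))
    smoothing = pseudocount * len(AMINO_ACIDS)
    linker_total = sum(linker_counts[a] for a in AMINO_ACIDS) + smoothing
    domain_total = sum(domain_counts[a] for a in AMINO_ACIDS) + smoothing
    index = {}
    for a in AMINO_ACIDS:
        linker_freq = (linker_counts[a] + pseudocount) / linker_total
        domain_freq = (domain_counts[a] + pseudocount) / domain_total
        index[a] = math.log(linker_freq / domain_freq)
    return index


def preference_profile(sequence, index, window=5):
    """Centered sliding-window mean of index values (window truncated at the ends)."""
    values = [index.get(a, 0.0) for a in sequence]  # unknown residues score 0
    half = window // 2
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def anneal_threshold(profile, labels, steps=2000, temp=1.0, cooling=0.995, seed=0):
    """Simulated annealing over one scalar threshold for one segment.

    Assumed objective: number of residues where (profile > threshold)
    matches the known label (1 = linker, 0 = domain).
    """
    rng = random.Random(seed)

    def score(threshold):
        return sum((p > threshold) == bool(l) for p, l in zip(profile, labels))

    lo, hi = min(profile), max(profile)
    step = (hi - lo) / 10 or 1.0  # proposal width; fallback for a flat profile
    current = rng.uniform(lo, hi)
    current_score = score(current)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = min(hi, max(lo, current + rng.gauss(0, step)))
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score
```

Under these assumptions, a per-segment prediction would split the profile into intervals, call `anneal_threshold` on each interval with its training labels, and mark residues whose profile value exceeds that interval's threshold as linker candidates.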
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 23-30 |
| Number of pages | 8 |
| Journal | Computational Biology and Chemistry |
| Volume | 55 |
| Publication status | Published - Apr 2015 |
Keywords
- Amino acid composition
- Compositional index
- Domain linker prediction
- Simulated annealing
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Organic Chemistry
- Computational Mathematics