(653a) Iterative Loop Structure Prediction Using Flexible and Fixed Stem Geometries
AIChE Annual Meeting
2010
Food, Pharmaceutical & Bioengineering Division
Molecular Modeling of Biophysical Processes II - Protein Structure and Dynamics
Thursday, November 11, 2010 - 12:30pm to 12:50pm
Loop structure prediction is an important intermediate step toward predicting the tertiary structure of proteins [1,2,3]. It can be regarded as a miniature ab initio structure prediction problem, because loops lack the fixed local bonding patterns that would otherwise help reduce the conformational search space. Fixed-stem and flexible-stem loop structure prediction differ in how much is known about the geometry of the stem residues supporting the loop. The flexible-stem problem is the more challenging of the two, since only the identity of the stem secondary structure is known. We generated a large database of loop-region backbone phi and psi angles, together with pairwise distances between residues separated by three and four residues in the loop. The distance and dihedral angle data were divided into bins of equal width, and the resulting distributions were used to generate initial backbone dihedral angles and bounds on pairwise residue distances. Fast side-chain rotamer optimization steps have been implemented, in which a pairwise linear form of the van der Waals and hydrogen bonding terms of the ECEPP/3 force field is minimized [4]. The side-chain rotamer optimization first applies FASTER, a polynomial-time version of the dead-end elimination (DEE) algorithm [5]. This is followed by random point mutations in rotamer values. Finally, a sweep across the chain replaces every rotamer that causes a backbone steric clash with one that does not. The backbone is then subjected to local energy minimization using the ECEPP/3 force field and the NPSOL local solver. Once a set of conformers has been generated, the conformers are clustered using ICON, a novel iterative clustering algorithm based on the traveling salesman problem (TSP) [6]. ICON treats each generated conformer as a node in a TSP and finds the best path connecting these nodes.
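The equal-width binning of backbone dihedral angles described above, and the sampling of initial angles from the binned distribution, can be illustrated with a minimal sketch. It is not the authors' implementation: the 10-degree bin width, the function names, and the toy phi values are assumptions made for illustration, since the abstract states none of them.

```python
import random
from collections import Counter

BIN_WIDTH = 10.0  # degrees; assumed value, the abstract gives no bin width


def bin_index(angle_deg, width=BIN_WIDTH):
    """Map a dihedral angle in degrees to an equal-width bin over [-180, 180)."""
    n_bins = int(360.0 // width)
    wrapped = (angle_deg + 180.0) % 360.0
    # Guard against floating-point wrap-around landing exactly on 360.0.
    return min(int(wrapped // width), n_bins - 1)


def build_histogram(angles, width=BIN_WIDTH):
    """Count how many database angles fall in each bin."""
    return Counter(bin_index(a, width) for a in angles)


def sample_angle(hist, rng, width=BIN_WIDTH):
    """Pick a bin with probability proportional to its count,
    then draw an angle uniformly inside that bin."""
    bins = sorted(hist)
    weights = [hist[b] for b in bins]
    chosen = rng.choices(bins, weights=weights, k=1)[0]
    low = -180.0 + chosen * width
    return rng.uniform(low, low + width)


# Toy phi values standing in for one loop position in the database.
phi_database = [-63.0, -57.5, -61.2, -120.4, -65.0]
phi_hist = build_histogram(phi_database)
initial_phi = sample_angle(phi_hist, random.Random(0))
```

The same binning could be applied to the pairwise distances, with the edges of the populated bins serving as lower and upper distance bounds.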
An integer linear programming (ILP) model is used to rigorously identify cluster boundaries, and sparse clusters are eliminated at each iteration. The top half of the densest clusters is used to generate tighter bounds on the backbone dihedral angles and pairwise residue distances, and these bounds are used to generate initial structures in subsequent iterations of the algorithm. To ensure that the structures generated at each iteration are unique, any two random initial structures are required to differ by at least one bin in at least one backbone dihedral angle. The approach has been applied to loops ranging in length from 5 to 14 residues, and promising preliminary results have been obtained. For flexible-stem prediction of loops of 5 to 14 residues, we analyzed 50 loops of each length, each loop lying between a pair of helices. The average of the lowest RMSD obtained for each of the loop lengths was 0.85 Å, 0.98 Å, 1.18 Å, 1.34 Å, 1.66 Å, 1.70 Å, 1.81 Å, 1.96 Å, 2.07 Å, 2.12 Å and 2.30 Å, respectively. Note that the RMSD values were evaluated over the entire structure, including the three stem residues on either side of the loop. Computational results for an exhaustive set of over 700 loops will be reported for all loop lengths, covering loops between helices, between strands, and between any combination of the two secondary structures.
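The bound-tightening and uniqueness steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: each cluster is represented as a hypothetical (density, conformers) pair, a conformer is a plain list of dihedral angles in degrees, the bin width is assumed to be 10 degrees, and the tightened bounds are taken as a simple per-angle minimum and maximum that ignores angular periodicity.

```python
def top_half_by_density(clusters):
    """clusters: list of (density, [conformer, ...]) pairs.
    Keep the conformers of the denser half (at least one cluster)."""
    ranked = sorted(clusters, key=lambda c: c[0], reverse=True)
    keep = max(1, len(ranked) // 2)
    return [conf for _, members in ranked[:keep] for conf in members]


def tighten_bounds(conformers):
    """Per-dihedral (lower, upper) bounds spanned by the retained conformers.
    Simplification: treats angles as plain numbers, not periodic values."""
    return [(min(col), max(col)) for col in zip(*conformers)]


def bin_signature(conformer, width=10.0):
    """Tuple of equal-width bin indices, one per dihedral angle."""
    n_bins = int(360.0 // width)
    return tuple(
        min(int(((a + 180.0) % 360.0) // width), n_bins - 1) for a in conformer
    )


def is_unique(candidate, seen_signatures, width=10.0):
    """True if the candidate differs from every accepted structure
    by at least one bin in at least one dihedral angle."""
    return bin_signature(candidate, width) not in seen_signatures


# Toy data: two clusters of (phi, psi) pairs with made-up densities.
clusters = [
    (0.9, [[-60.0, -45.0], [-65.0, -40.0]]),
    (0.2, [[120.0, 130.0]]),
]
kept = top_half_by_density(clusters)
bounds = tighten_bounds(kept)
seen = {bin_signature(c) for c in kept}
```

A new random structure would be drawn inside `bounds` and accepted only if `is_unique` returns true, after which its signature is added to `seen`.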
Bibliography
-----------------
[1] McAllister SR and Floudas CA (2010) An improved hybrid global optimization method for protein tertiary structure prediction, Comput. Optim. Appl. 45, 377-413
[2] Floudas CA, Fung HK, McAllister SR, Monnigmann M and Rajgaria R (2006) Advances in Protein Structure Prediction and De Novo Protein Design: A Review, Chem. Eng. Sci., 61, 966-988
[3] Floudas CA (2007) Computational methods in protein structure prediction, Biotechnol. Bioeng., 97, 207-213
[4] Lovell SC (2000) The penultimate rotamer library, Proteins, 40, 389-408
[5] Desmet J, Spriet J and Lasters I (2002) Fast and accurate Side-Chain Topology and Energy Refinement (FASTER) as a new method for protein structure optimization, Proteins, 48, 31-43
[6] Subramani A, DiMaggio PA and Floudas CA (2009) Selecting high quality Protein structures from Diverse Conformational Ensembles, Biophys. J., 97, 1728-1736