(119f) De Novo Protein Design of a Conformational Switch Between α-Helical Protein and Mixed α/β Protein
AIChE Annual Meeting
Food, Pharmaceutical & Bioengineering Division
Protein Engineering II - Techniques
Monday, November 8, 2010 - 2:20pm to 2:40pm
The aim of de novo protein design is to find the amino acid sequence (or sequences) that will fold into a desired three-dimensional structure. Experimental methods such as directed evolution have been successful in protein design, but they are restricted to a limited sequence search space (roughly 10^3 to 10^6 sequences). Computational design strategies can search much larger sequence spaces, and for this reason computational approaches to protein design have gained popularity.
To reduce the complexity of the problem, it is important to know which amino acids carry the most information for fold specification and how changes in the sequence influence the structure. High sequence identity between two proteins usually means that the two share a fold. However, the success of the Paracelsus Challenge [1] and the recent development of T498 and T499 [2] have shown that this is not always the case. T498 is a 56-residue protein that folds into a 3-α fold, and T499 is a 56-residue protein that folds into one α-helix and four β-strands. Despite these structural differences, the two proteins differ by only three point mutations. This type of conformational switch poses a major question in protein design: what is the minimum number of mutations necessary to induce a conformational switch in a protein?
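To give a sense of scale for this question, the short sketch below counts how many sequences lie within a few point mutations of a 56-residue parent and compares that with the full sequence space. It assumes the 20 standard amino acids at every position; the function names are illustrative and not part of the original work.

```python
from math import comb


def count_mutants(length: int, max_mutations: int, alphabet: int = 20) -> int:
    """Count sequences that differ from a fixed parent at 1..max_mutations positions.

    For exactly k mutations: choose k positions, then one of the
    (alphabet - 1) non-parent residues at each chosen position.
    """
    return sum(
        comb(length, k) * (alphabet - 1) ** k
        for k in range(1, max_mutations + 1)
    )


def full_space(length: int, alphabet: int = 20) -> int:
    """Total number of sequences of the given length."""
    return alphabet ** length


if __name__ == "__main__":
    n = 56  # residues in T498 / T499
    for k in (1, 2, 3):
        print(f"sequences within {k} mutation(s): {count_mutants(n, k):,}")
    print(f"full sequence space: {full_space(n):.3e}")
```

Even the neighborhood of up to three mutations is far larger than the 10^3 to 10^6 sequences accessible experimentally, yet it is a vanishing fraction of the full space, which is one reason a targeted optimization formulation is attractive.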
We formulated the question as a mixed-integer linear programming (MILP) model that minimizes the number of mutations needed to switch from the T498 fold (3-α) to the T499 fold (mixed α/β). The mutant sequences are constrained to have a higher energy in the T498 fold than the native T498 sequence and a lower energy in the T499 fold than the native T499 sequence. In this way, the mutant sequences are more energetically favorable in the mixed α/β fold than in the 3-α fold.
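One way the model described above might be written is sketched below. The notation is introduced here for illustration and is not taken from the abstract: $y_i^j = 1$ if position $i$ holds amino acid $j$, $s_i$ is the native T498 residue at position $i$, $E^{F}_{ik,jl}$ is an assumed pairwise energy parameter for fold $F \in \{A, B\}$ (with $A$ the T498 fold and $B$ the T499 fold), and $w_{ik}^{jl}$ linearizes the product $y_i^j y_k^l$. The pairwise additive energy form is an assumption borrowed from the cited sequence-selection formulations.

```latex
\begin{align}
\min_{y,\,w}\quad
  & \sum_{i=1}^{n} \bigl(1 - y_i^{s_i}\bigr)
  && \text{(number of mutations from native T498)} \\
\text{s.t.}\quad
  & \sum_{j=1}^{m} y_i^{j} = 1
  && \forall i \\
  & E_F(w) = \sum_{i<k} \sum_{j=1}^{m} \sum_{l=1}^{m} E^{F}_{ik,jl}\, w_{ik}^{jl}
  && F \in \{A, B\} \\
  & E_A(w) \ge E_A^{\text{nat}}
  && \text{(less favorable in the 3-}\alpha\text{ fold)} \\
  & E_B(w) \le E_B^{\text{nat}}
  && \text{(more favorable in the mixed } \alpha/\beta \text{ fold)} \\
  & w_{ik}^{jl} \le y_i^{j}, \quad
    w_{ik}^{jl} \le y_k^{l}, \quad
    w_{ik}^{jl} \ge y_i^{j} + y_k^{l} - 1
  && \forall i<k,\ \forall j,l \\
  & y_i^{j} \in \{0,1\}, \quad 0 \le w_{ik}^{jl} \le 1
\end{align}
```

Here $E_A^{\text{nat}}$ is the energy of the native T498 sequence in the T498 fold and $E_B^{\text{nat}}$ is the energy of the native T499 sequence in the T499 fold. Because a MILP cannot express strict inequalities, a small margin could be added to either energy constraint if strictly higher or lower energies are required.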
This MILP model was implemented using the two-stage protein design method developed by Klepeis et al. [3] and enhanced by Fung et al. [4, 5, 6, 7]. In the first stage, hundreds of sequences were generated that have one to three mutations from the native T498 sequence yet are predicted to adopt the T499 fold. In the second stage, the folds of the sequences were computationally validated using a number of metrics, including fold specificity [8], homology modeling [9], first-principles structure prediction [10], and molecular dynamics [11]. In addition, various sequence entropy calculations were performed to identify residue positions that were preferentially conserved (low entropy). These low-entropy sites may be functionally and/or structurally important, as they accepted fewer mutations than the rest of the sequence.
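As a minimal illustration of the conservation analysis, the sketch below computes a per-position Shannon entropy over a set of equal-length sequences. The abstract does not state which entropy measures were used, so Shannon entropy is an assumption here, and the toy sequences are invented purely for the example.

```python
from collections import Counter
from math import log2


def positional_entropy(sequences):
    """Return the Shannon entropy (in bits) of each alignment column.

    A value of 0.0 means the position is fully conserved; larger values
    mean the position accepted more varied residues.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    entropies = []
    for column in zip(*sequences):  # iterate over positions
        counts = Counter(column)
        total = len(column)
        # H = sum_a p_a * log2(1 / p_a), written to avoid a negative zero
        h = sum((c / total) * log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


if __name__ == "__main__":
    # Hypothetical designed sequences (not data from the study)
    designs = ["ACD", "ACE", "AGE", "ACE"]
    for pos, h in enumerate(positional_entropy(designs), start=1):
        print(f"position {pos}: {h:.3f} bits")
```

Positions with entropy near zero in such a calculation would correspond to the preferentially conserved sites discussed above.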
[1] Rose GD (1997) Protein Folding and the Paracelsus Challenge. Nature Structural Biology 4:512-514.
[2] He Y, Chen Y, Alexander P, Bryan PN, Orban J (2008) NMR structures of two designed proteins with high sequence identity but different fold and function. Proceedings of the National Academy of Sciences USA 105:14412-14417.
[3] Klepeis JL, Floudas CA, Morikis D, Tsokos CG, Lambris JD (2004) Design of Peptide Analogs with Improved Activity Using a Novel De Novo Protein Design Approach. Industrial & Engineering Chemistry Research 43:3817-3826.
[4] Fung HK, Rao S, Floudas CA, Prokopyev O, Pardalos PM, Rendl F (2005) Computational Comparison Studies of Quadratic Assignment Like Formulations for the In Silico Sequence Selection Problem in De Novo Protein Design. Journal of Combinatorial Optimization 10:41-60.
[5] Fung HK, Taylor MS, Floudas CA (2007) Novel formulations for the sequence selection problem in de novo protein design with flexible templates. Optimization Methods and Software 22:51-71.
[6] Fung HK, Floudas CA, Taylor MS, Zhang L, Morikis D (2008) Toward Full-Sequence De Novo Protein Design with Flexible Templates for Human Beta-Defensin-2. Biophysical Journal 94:584-599.
[7] Fung HK, Welsh WJ, Floudas CA (2008) Computational De Novo Peptide and Protein Design: Rigid Templates versus Flexible Templates. Industrial & Engineering Chemistry Research 47:993-1001.
[8] Vorobjev YN, Vila JA, Scheraga HA (2008) FAMBE-pH: A Fast and Accurate Method to Compute the Total Solvation Free Energies of Proteins. Journal of Physical Chemistry B 112:11122-11136.
[9] Wu S, Zhang Y (2007) LOMETS: A local meta-threading-server for protein structure prediction. Nucleic Acids Research 35:3375-3382.
[10] Klepeis JL, Floudas CA (2003) ASTRO-FOLD: A combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence. Biophysical Journal 85:2119-2146.
[11] Lindahl E, Hess B, van der Spoel D (2001) GROMACS 3.0: a package for molecular simulation and trajectory analysis. Journal of Molecular Modeling 7:306-317.