(424c) PILOT_PROTEIN: A High-Throughput Method for In Silico Discovery of Peptides, Proteins, and Post-Translational Modifications

Conference

AIChE Annual Meeting

Year

2011

Proceeding

2011 Annual Meeting

Group

2011 Annual Meeting of the American Electrophoresis Society (AES)

Session

Advances In Electrophoretic Protein Separation and Analysis

Time

Wednesday, October 19, 2011 - 9:12am to 9:33am

Authors

Baliban, R. - Presenter, Princeton University

DiMaggio, P. A. Jr. - Presenter, Princeton University

Li, Z. - Presenter, Princeton University

Plazas-Mayorca, M. - Presenter, Princeton University

Garcia, B. A. - Presenter, Princeton University

Floudas, C. A. - Presenter, Princeton University

A fundamental problem in proteomics is modified protein identification: determining the sequence of a protein along with all amino acid modifications that were added after the protein was constructed in vivo (post-translational modifications, or PTMs). We present a comprehensive set of tools that address this problem using mixed-integer linear optimization (MILP) and tandem mass spectrometry (MS/MS). These tools have been integrated into a single web tool that is freely available to the scientific community.

MS/MS spectra are first analyzed using the de novo sequencing algorithm PILOT [1], which rigorously guarantees a rank-ordered list of optimal candidate sequences without complete enumeration of all possible sequences. To exploit the strengths of database searching, a hybrid de novo/database method based on local sequence alignment, PILOT_SEQUEL [2], takes the PILOT results as input and queries these candidate sequences against a protein database to find peptide matches.
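The database-query step can be illustrated with a minimal local alignment sketch. This is a generic Smith-Waterman routine, not the PILOT_SEQUEL scoring scheme; the function name `local_align` and the match, mismatch, and gap values are assumptions chosen for illustration. Isoleucine and leucine are merged before comparison because they have identical mass and cannot be told apart from fragment masses alone.

```python
def local_align(query, target, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment (illustrative scores, not PILOT_SEQUEL's).

    Returns the best local score and the matching span of `target`.
    """
    # I and L are isobaric, so compare them as the same residue.
    q = query.replace("I", "L")
    t = target.replace("I", "L")
    rows, cols = len(q) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming table; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if q[i - 1] == t[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    end = j
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if q[i - 1] == t[j - 1] else mismatch)
        if score[i][j] == diag:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, target[j:end]


if __name__ == "__main__":
    # Hypothetical de novo candidate and database sequence.
    print(local_align("PEPTIDE", "MKAPEPTLDEGG"))
```

A local (rather than global) alignment suits this setting because a de novo candidate is typically only partly correct, so the goal is to find the best-matching stretch of a database sequence rather than to align the whole peptide end to end.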

A comprehensive unmodified protein list is then constructed using the novel method PILOT_PROTEIN [3]. PILOT_PROTEIN uses the scored peptides from PILOT_SEQUEL to generate the protein list and incorporates a peptide clustering routine to reduce false positives. Given a known list of proteins, either an untargeted or a targeted PTM search can be performed to determine the set of PTM types and sites that best explains the experimental data [4]. If the proteins are known to be highly modified, an MILP approach may be used to identify and quantify all proteins that may be present in each MS/MS spectrum. The output of these routines is a complete modified protein list.
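To make the idea of building a protein list from scored peptides concrete, the sketch below uses a greedy parsimony heuristic. It is a stand-in for illustration only: PILOT_PROTEIN itself is described as an optimization-based method with peptide clustering, which this code does not implement. The function name `infer_proteins`, the score threshold, and the example peptides and protein labels are all hypothetical.

```python
def infer_proteins(peptide_scores, peptide_to_proteins, min_score=0.0):
    """Greedy parsimony sketch (not the PILOT_PROTEIN formulation).

    Repeatedly selects the protein that explains the largest total score
    among peptides not yet explained by an earlier selection.
    """
    # Keep only peptides whose score clears the (assumed) threshold.
    remaining = {p: s for p, s in peptide_scores.items() if s > min_score}
    chosen = []
    while remaining:
        gain = {}
        for pep, s in remaining.items():
            for prot in peptide_to_proteins.get(pep, ()):
                gain[prot] = gain.get(prot, 0.0) + s
        if not gain:
            break  # leftover peptides map to no protein
        # Highest gain wins; ties are broken alphabetically for determinism.
        best = min(gain, key=lambda pr: (-gain[pr], pr))
        chosen.append((best, gain[best]))
        remaining = {
            pep: s
            for pep, s in remaining.items()
            if best not in peptide_to_proteins.get(pep, ())
        }
    return chosen


if __name__ == "__main__":
    # Hypothetical scored peptides and their candidate parent proteins.
    scores = {"AAK": 0.9, "CCR": 0.8, "DDK": 0.4}
    mapping = {"AAK": ["P1", "P2"], "CCR": ["P1"], "DDK": ["P3"]}
    for protein, explained in infer_proteins(scores, mapping):
        print(protein, round(explained, 2))
```

In this toy input the shared peptide is attributed to P1, which also has a unique peptide, so P2 is never reported. This shows in miniature how preferring a small explanatory set can suppress false-positive proteins supported only by shared evidence.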

To verify the predictive capability of each method, several comparative studies were performed against state-of-the-art algorithms using data sets from a variety of MS/MS instruments and fragmentation types. Specifically, the peptide prediction accuracy of PILOT and PILOT_SEQUEL, the protein identification accuracy of PILOT_PROTEIN, and the PTM prediction accuracy of PILOT_PTM [4] were analyzed using over 170 LC-MS/MS data sets from the Standard Protein Mix Database [5], comprising a total of 1.5 million MS/MS spectra. Each method showed superior predictive accuracy compared with the other algorithms, and this held across all of the test data sets.

[1] P. A. DiMaggio Jr. and C. A. Floudas. De novo peptide identification via tandem mass spectrometry and integer linear optimization. Anal. Chem., 79:1433–1446, 2007.

[2] P. A. DiMaggio Jr., B. Lu, J. R. Yates III, and C. A. Floudas. A Hybrid Method for Peptide Identification Using Integer Linear Optimization, Local Database Search, and Quadrupole Time-of-Flight or OrbiTrap Tandem Mass Spectrometry. J. Proteome Res., 7(4):1584–1593, 2008.

[3] R. C. Baliban, P. A. DiMaggio Jr., Z. Li, M. D. Plazas-Mayorca, B. A. Garcia, and C. A. Floudas. Identification of modified and unmodified proteins via high-resolution tandem mass spectrometry and mixed-integer linear optimization. Mol. Cell. Proteomics, submitted.

[4] R. C. Baliban, P. A. DiMaggio Jr., M. D. Plazas-Mayorca, N. L. Young, B. A. Garcia, and C. A. Floudas. A Novel Method for Untargeted Post-Translational Modification Identification Using Integer Linear Optimization and Tandem Mass Spectrometry. Mol. Cell. Proteomics, 9:764–779, 2010.

[5] J. Klimek, J. S. Eddes, L. Hohmann, J. Jackson, A. Peterson, S. Letarte, P. R. Gafken, J. E. Katz, P. Mallick, H. Lee, A. Schmidt, R. Ossola, J. K. Eng, R. Aebersold, and D. B. Martin. The Standard Protein Mix Database: A Diverse Data Set To Assist in the Production of Improved Peptide and Protein Identification Software Tools. J. Proteome Res., 7(1):96–103, 2008.
