(584z) Robust In Silico Disease Classification Via Disease- and Procedure-Independent Optimization Models Using Quantitative MS1 Data From High-Throughput Proteomics

Conference

AIChE Annual Meeting

Year

2013

Proceeding

2013 AIChE Annual Meeting

Group

Food, Pharmaceutical & Bioengineering Division

Session

Poster Session: Engineering Fundamentals in Life Science

Time

Wednesday, November 6, 2013 - 6:00pm to 8:00pm

Authors

Guzman, Y. A. - Presenter, Princeton University

Floudas, C. A., Princeton University

Riley, C. P., Pathology Associates Medical Laboratories

A biomarker is a measurable characteristic that can indicate biological state relative to disease stage or contraction risk. Biomarkers have great potential to transform diagnostic medicine by serving as early indicators or predictors of developing or oncoming disease. Monitoring changes in the protein domain has the greatest potential to lead to the discovery of biomarkers because protein expression and abundance are closely tied to cellular state [1]. The need for rapid diagnostic turnaround in time-sensitive disease systems has yielded an explosion of biomarker discovery research utilizing high-throughput mass spectrometry proteomics protocols within the past decade.

Efforts to discover and define robust protein biomarkers have yielded disappointing results. From 1997 to 2006, about 224,000 biomarker papers were published, while only 15 biomarkers were approved for use by the FDA [2]. The initial stage of a biomarker discovery study typically involves a small number of samples, and what seems to be a distinguishing protein biomarker often yields a large number of false positives and false negatives at later validation stages with larger sample sizes and greater subject heterogeneity [1,3]. Focus has therefore shifted to utilizing panels of biomarkers to create a set of diagnostic rules.

The complexity of biological samples and fluids makes the application of typical high-throughput proteomic analysis exceedingly difficult. Blood and blood plasma remain the most enticing biofluidic sources of protein biomarkers because of their clinical accessibility, but they display a dynamic concentration range spanning up to 11 orders of magnitude, with 99% of protein mass coming from 22 blood proteins [4-6]. Typical data-dependent acquisition for MS/MS fragmentation will exclude low-abundance proteins whose up- and down-regulation may capture disease response and treatment progression, and its stochastic nature reduces run-to-run reproducibility. These difficulties limit the depth of untargeted MS/MS protein identification protocols. In response, many biomarker discovery studies have focused on classification using MS¹ features and have reported very high sensitivity and specificity. These studies have also elicited criticism [7-9], as statistical methods, machine learning techniques, and black-box models are prone to over-training and can magnify features that are actually data artifacts [9,10].

Building on previous studies in which mixed-integer linear optimization models were proposed to classify healthy and diseased samples [11,12], we propose a novel class of robust optimization models that can fingerprint and classify healthy and diseased samples given quantitative MS¹ data. These models simultaneously select the optimal subset of distinguishing MS¹ peaks and perform parameter estimation. The resulting functions are of diagnostic utility, quantitatively classifying new blind samples given only MS¹ data. The new classification models are general and independent of sample biofluid, experimental protocol, and disease system. The optimal peak subset yields a multiple reaction monitoring protein identification protocol for further sample characterization and biomarker investigation. Results from the proposed models are presented as applied to MS¹ data of proteomics samples collected from different biofluids, subjected to different experimental protocols, and relating to vastly different disease systems, including plasma samples collected from breast cancer patients and gingival crevicular fluid samples collected from patients with chronic periodontitis [13,14].
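
The idea of selecting a small peak subset while fitting classifier parameters can be illustrated with a toy sketch. This is not the mixed-integer linear formulation of [11,12] or the robust models proposed here: exhaustive enumeration of subsets stands in for the integer selection variables, and a nearest-centroid linear rule stands in for the optimized parameters. The function names (`fit_centroid_rule`, `classify`, `select_peaks`) and all intensity values are hypothetical and synthetic, chosen only for illustration.

```python
from itertools import combinations

def fit_centroid_rule(samples, labels, peaks):
    """Fit a linear rule on the chosen peaks: score(x) = sum_j w_j * x_j - bias."""
    healthy = [s for s, y in zip(samples, labels) if y == 0]
    diseased = [s for s, y in zip(samples, labels) if y == 1]
    mean_h = [sum(s[j] for s in healthy) / len(healthy) for j in peaks]
    mean_d = [sum(s[j] for s in diseased) / len(diseased) for j in peaks]
    # Weights point from the healthy centroid toward the diseased centroid.
    weights = [d - h for d, h in zip(mean_d, mean_h)]
    # The threshold sits at the midpoint between the two class centroids.
    bias = sum(w * (d + h) / 2 for w, d, h in zip(weights, mean_d, mean_h))
    return weights, bias

def classify(sample, model):
    """Return 1 (diseased) if the score is positive, else 0 (healthy)."""
    peaks, weights, bias = model
    score = sum(w * sample[j] for w, j in zip(weights, peaks)) - bias
    return 1 if score > 0 else 0

def select_peaks(samples, labels, max_peaks):
    """Try every subset of up to max_peaks peaks; keep the one with the
    fewest training errors, preferring smaller subsets on ties."""
    n_peaks = len(samples[0])
    best_key, best_model = None, None
    for k in range(1, max_peaks + 1):
        for peaks in combinations(range(n_peaks), k):
            weights, bias = fit_centroid_rule(samples, labels, peaks)
            model = (peaks, weights, bias)
            errors = sum(classify(s, model) != y
                         for s, y in zip(samples, labels))
            key = (errors, k)
            if best_key is None or key < best_key:
                best_key, best_model = key, model
    return best_model

# Synthetic MS1 peak intensities: rows are samples, columns are peaks.
samples = [
    [5.0, 1.0, 2.0, 7.0],  # healthy
    [4.0, 3.0, 2.5, 6.0],  # healthy
    [6.0, 2.0, 1.5, 8.0],  # healthy
    [5.5, 2.0, 6.0, 7.5],  # diseased
    [4.5, 1.5, 7.0, 6.5],  # diseased
    [5.0, 2.5, 6.5, 7.0],  # diseased
]
labels = [0, 0, 0, 1, 1, 1]

model = select_peaks(samples, labels, max_peaks=2)
selected_peaks = model[0]

# Classify two new "blind" samples using only their peak intensities.
blind_diseased = classify([5.2, 2.2, 5.8, 7.1], model)
blind_healthy = classify([5.0, 2.0, 2.2, 7.0], model)
```

Enumeration is only feasible for a handful of peaks. An optimization formulation with binary selection variables, as in the models described above, is what makes the search tractable when thousands of MS¹ features are available.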

[1] Rifai N., Gillette M.A., Carr S.A. Nature Biotechnology, 24(8):971-983, 2006.
[2] Jin G., Zhou X., Wang H., Wong S.T.C. The Challenges in Blood Proteomic Biomarker Discovery. In Pham T., Computational Biology: Issues and Applications in Oncology. New York: Springer, 2009.
[3] Srinivas P.R., Verma M., Zhao Y., Srivastava S. Clinical Chemistry, 48(8):1160-1169, 2002.
[4] Anderson N.L., Anderson N.G. Molecular & Cellular Proteomics, 1(11):845-867, 2002.
[5] Schiess R., Wollscheid B., Aebersold R. Molecular Oncology, 3(1):33-44, 2009.
[6] Veenstra T.D., Conrads T.P., Hood B.L., Avellino A.M., Ellenbogen R.G., Morrison R.S. Molecular & Cellular Proteomics, 4(4):409-418, 2005.
[7] Sorace J.M., Zhan M. BMC Bioinformatics, 4:24, 2003.
[8] Poste G. Nature, 469(7329):156-157, 2011.
[9] Rogers M.A., Clarke P., Noble J., Munro N.P., Paul A., Selby P.J., Banks R.E. Cancer Research, 63(20):6971-6983, 2003.
[10] He Z., Yu W. Computational Biology and Chemistry, 34(4):215-225, 2010.
[11] Baliban R.C., Sakellari D., Li Z., Guzman Y.A., Garcia B.A., Floudas C.A. Journal of Clinical Periodontology, 40(2):131-139, 2013.
[12] Baliban R.C., DiMaggio P.A., Plazas-Mayorca M.D., Garcia B.A., Floudas C.A. Journal of Proteome Research, 11(9):4615-4629, 2012.
[13] Riley C.P., Zhang X., Nakshatri H., Schneider B., Regnier F.E., Adamec J., Buck C. Journal of Translational Medicine, 9:80, 2011.
[14] Baliban R.C., Sakellari D., Li Z., DiMaggio P.A., Garcia B.A., Floudas C.A. Journal of Clinical Periodontology, 39(3):203-212, 2012.
