(594f) Enhancing Compound-Protein Interaction Prediction with Confidence Assessment

Conference

AIChE Annual Meeting

Year

2024

Proceeding

2024 AIChE Annual Meeting

Group

Food, Pharmaceutical & Bioengineering Division

Session

Machine Learning Based Protein Engineering

Time

Wednesday, October 30, 2024 - 5:22pm to 5:40pm

Authors

Chinas Serrano, L. - Presenter, University of Toronto

A. Barghout, R.

Xu, Z., University of Toronto

Mahadevan, R., University of Toronto

In the rapidly evolving field of biochemical engineering, predicting functions arising from compound-protein interactions remains a critical challenge. Recent deep learning models have made significant strides in functional annotation, novel enzyme discovery, and metabolite identification. However, compound-protein interaction (CPI) prediction remains difficult because data sources are sparse and heterogeneous and the interactions themselves are complex. Our work addresses this challenge by introducing CPI-Pred, a versatile deep learning model designed specifically for predicting compound-protein interactions.

We assemble the largest kinetic parameter datasets, encompassing four critical kinetic parameters: the Michaelis-Menten constant (K_M) containing ~85k datapoints, the enzyme turnover number (k_cat) containing ~45k datapoints, the catalytic efficiency (k_cat/K_M) containing ~20k datapoints, and the inhibition constant (K_I) containing ~77k datapoints. These parameters are essential for understanding enzyme functionality within metabolic contexts and their regulation by compounds.
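For readers less familiar with these quantities, the sketch below shows how the four parameters enter the textbook Michaelis-Menten rate law. It is an illustration only and not part of CPI-Pred: the function names and numeric values are hypothetical, and the inhibition formula assumes purely competitive inhibition, which the abstract does not state.

```python
def michaelis_menten_rate(substrate, enzyme, k_cat, k_m):
    """Initial rate v = k_cat * [E] * [S] / (K_M + [S])."""
    return k_cat * enzyme * substrate / (k_m + substrate)


def catalytic_efficiency(k_cat, k_m):
    """Catalytic efficiency k_cat / K_M."""
    return k_cat / k_m


def competitive_inhibition_rate(substrate, enzyme, inhibitor, k_cat, k_m, k_i):
    """Assumes competitive inhibition: apparent K_M = K_M * (1 + [I] / K_I)."""
    apparent_k_m = k_m * (1.0 + inhibitor / k_i)
    return michaelis_menten_rate(substrate, enzyme, k_cat, apparent_k_m)


# Hypothetical values: k_cat in 1/s, concentrations and constants in mM.
k_cat, k_m, k_i = 100.0, 0.5, 0.2
v_free = michaelis_menten_rate(0.5, 0.001, k_cat, k_m)
v_inhibited = competitive_inhibition_rate(0.5, 0.001, 0.2, k_cat, k_m, k_i)
print(v_free, v_inhibited, catalytic_efficiency(k_cat, k_m))
```

When the substrate concentration equals K_M, the uninhibited rate is half of the maximum rate k_cat * [E], which is why K_M is often read as a measure of apparent substrate affinity.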

CPI-Pred combines novel compound representations, enzyme language models, and attention mechanisms. Compound representations are learned using a message-passing neural network, capturing essential features of chemical compounds. Enzyme representations are extracted from state-of-the-art protein language models, encoding rich information about enzymes. Additionally, we incorporate novel sequence pooling and cross-attention mechanisms to enhance the model's performance.
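To make the cross-attention and pooling idea concrete, here is a minimal sketch of generic scaled dot-product cross-attention followed by mean pooling, written in plain Python. It is not the CPI-Pred architecture: the abstract does not describe the layers, so the function names, the absence of learned projection matrices, and the toy embeddings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Each query vector attends over all key/value vectors (no learned weights)."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return outputs


def mean_pool(vectors):
    """Average a variable-length list of vectors into one fixed-size vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


# Toy inputs (hypothetical): 2 atom embeddings query 3 residue embeddings.
atom_embeddings = [[1.0, 0.0], [0.0, 1.0]]
residue_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = cross_attention(atom_embeddings, residue_embeddings, residue_embeddings)
fused = mean_pool(attended)
print(fused)
```

In this sketch the compound side supplies the queries and the enzyme side supplies keys and values, so each atom embedding is rewritten as a weighted mix of residue embeddings. The opposite direction, or both, would be equally plausible design choices.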

To address the inherent uncertainty in CPI predictions, we introduce a confidence predictor model. This auxiliary component assesses the confidence level associated with each interaction prediction. It evaluates factors such as data quality, model uncertainty, and input features, providing a confidence level score that quantifies the reliability of the CPI-Pred output.
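The abstract does not specify how the confidence predictor is built. As one simple stand-in, the sketch below derives a confidence score from the disagreement among an ensemble of predictors and then ranks candidates for experimental follow-up. This is a substitute technique chosen for illustration, not the authors' auxiliary model. The function names, the `scale` parameter, and the example enzyme-compound pairs are hypothetical.

```python
import statistics


def ensemble_confidence(predictions, scale=1.0):
    """Map ensemble spread to (0, 1]: confidence = 1 / (1 + std / scale)."""
    spread = statistics.pstdev(predictions)
    return 1.0 / (1.0 + spread / scale)


def rank_for_validation(candidates, scale=1.0):
    """Return (name, mean prediction, confidence) sorted by confidence, highest first."""
    scored = [
        (name, statistics.mean(preds), ensemble_confidence(preds, scale))
        for name, preds in candidates
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


# Hypothetical log10(k_cat) predictions from three ensemble members per pair.
candidates = [
    ("enzyme_A + compound_1", [2.0, 2.0, 2.0]),
    ("enzyme_B + compound_2", [1.0, 3.0, 2.0]),
    ("enzyme_C + compound_3", [1.9, 2.1, 2.0]),
]
for name, mean_pred, confidence in rank_for_validation(candidates):
    print(f"{name}: prediction={mean_pred:.2f}, confidence={confidence:.2f}")
```

A learned confidence model could also draw on data quality and input features, as the abstract describes. Ensemble spread captures only the model-uncertainty part.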

Our model demonstrates robustness across diverse compound-protein interactions. By utilizing amino acid sequence and compound structure representations, CPI-Pred outperforms state-of-the-art models on unseen compounds and dissimilar enzymes. The confidence predictor provides additional insights, allowing users to gauge the trustworthiness of individual predictions.

Our workflow holds promise for addressing various metabolic engineering challenges, including enzyme design, drug discovery, and personalized medicine. By combining CPI-Pred's predictions with confidence assessments, researchers can make informed decisions and prioritize experimental validation. In summary, our integrated approach not only enhances prediction accuracy but also introduces a confidence assessment, bridging the gap between computational predictions and experimental validation.

Topics

Computational Molecular Engineering

Biological Engineering

Protein Engineering
