(594f) Enhancing Compound-Protein Interaction Prediction with Confidence Assessment
2024 AIChE Annual Meeting
Food, Pharmaceutical & Bioengineering Division
Machine Learning Based Protein Engineering
Wednesday, October 30, 2024 - 5:22pm to 5:40pm
We assemble the largest kinetic parameter datasets, encompassing four critical kinetic parameters: the Michaelis-Menten constant (KM) containing ~85k datapoints, the enzyme turnover number (kcat) containing ~45k datapoints, the catalytic efficiency (kcat/KM) containing ~20k datapoints, and the inhibition constant (KI) containing ~77k datapoints. These parameters are essential for understanding enzyme functionality within metabolic contexts and their regulation by compounds.
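For readers less familiar with these four parameters, the following minimal sketch shows how they enter the standard Michaelis-Menten rate law. It is illustrative only and not part of CPI-Pred: the function names and numeric values are hypothetical, and the inhibitor term assumes competitive inhibition, which the abstract does not specify.

```python
def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc,
                          inhibitor_conc=0.0, ki=None):
    """Initial reaction rate v = kcat*[E]*[S] / (KM_app + [S]).

    ASSUMPTION: when an inhibition constant KI is given, the inhibitor is
    treated as competitive, so only the apparent KM changes:
    KM_app = KM * (1 + [I]/KI).
    """
    km_apparent = km
    if ki is not None:
        km_apparent = km * (1.0 + inhibitor_conc / ki)
    return kcat * enzyme_conc * substrate_conc / (km_apparent + substrate_conc)


def catalytic_efficiency(kcat, km):
    """kcat/KM, the effective second-order rate constant at low substrate."""
    return kcat / km


# Hypothetical values: kcat in 1/s, all concentrations and constants in mM.
rate_free = michaelis_menten_rate(
    kcat=100.0, km=0.5, enzyme_conc=0.001, substrate_conc=0.5)
rate_inhibited = michaelis_menten_rate(
    kcat=100.0, km=0.5, enzyme_conc=0.001, substrate_conc=0.5,
    inhibitor_conc=1.0, ki=1.0)
efficiency = catalytic_efficiency(kcat=100.0, km=0.5)
```

At a substrate concentration equal to KM the uninhibited rate is half of the maximum kcat*[E], and a smaller KI means a stronger reduction in rate for the same inhibitor concentration.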
CPI-Pred combines novel compound representations, enzyme language models, and attention mechanisms. Compound representations are learned using a message-passing neural network, capturing essential features of chemical compounds. Enzyme representations are extracted from state-of-the-art protein language models, encoding rich information about enzymes. Additionally, we incorporate novel sequence pooling and cross-attention mechanisms to enhance the model's performance.
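The abstract does not give architectural details, so the sketch below only illustrates the general idea of cross-attention followed by pooling, not the CPI-Pred implementation. It uses plain scaled dot-product attention without learned projection matrices, simple mean pooling, and toy two-dimensional embeddings; every name and value here is a hypothetical stand-in for the outputs of a message-passing network (atoms) and a protein language model (residues).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    Returns one attended vector per query, each a weighted average of values.
    """
    dim = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended


def mean_pool(vectors):
    """Collapse a variable-length list of vectors into one fixed-size vector."""
    count = len(vectors)
    return [sum(v[j] for v in vectors) / count for j in range(len(vectors[0]))]


# Toy embeddings (hypothetical): 3 atoms and 4 residues, 2 dimensions each.
atom_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
residue_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]

# Atoms query the enzyme: each atom gets an enzyme-aware context vector.
compound_context = cross_attention(
    atom_embeddings, residue_embeddings, residue_embeddings)

# Pool both sides to fixed-size vectors and concatenate them; a regression
# head (not shown) would map this pair vector to a kinetic parameter.
pair_vector = mean_pool(compound_context) + mean_pool(residue_embeddings)
```

Pooling is what lets sequences and molecules of different sizes feed the same downstream predictor, since the pair vector has a fixed length regardless of the number of atoms or residues.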
To address the inherent uncertainty in CPI predictions, we introduce a confidence predictor model. This auxiliary component assesses the confidence associated with each interaction prediction. It evaluates factors such as data quality, model uncertainty, and input features, and outputs a confidence score that quantifies the reliability of the CPI-Pred output.
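The abstract does not describe how the confidence predictor is built. As one illustrative possibility, and explicitly not the authors' method, a confidence score can be approximated from the disagreement among an ensemble of predictors. In the sketch below, the function name, the `scale` parameter, and the example predictions (taken here to be log10 KM values) are all hypothetical.

```python
import statistics


def predict_with_confidence(ensemble_predictions, scale=1.0):
    """Return (mean prediction, confidence in (0, 1]).

    ASSUMPTION: confidence is derived from ensemble spread, a common
    stand-in for model uncertainty. Zero spread gives confidence 1.0, and
    confidence shrinks toward 0 as the members disagree more. `scale` sets
    how much spread (in prediction units) halves the confidence.
    """
    mean = statistics.fmean(ensemble_predictions)
    spread = statistics.pstdev(ensemble_predictions)
    confidence = 1.0 / (1.0 + spread / scale)
    return mean, confidence


# Hypothetical log10(KM) predictions from five ensemble members.
agreeing = predict_with_confidence([-1.02, -0.98, -1.00, -1.01, -0.99])
disagreeing = predict_with_confidence([-2.5, -0.2, -1.0, 0.4, -1.7])
```

A learned confidence model could additionally take data-quality and input-feature signals as inputs; the ensemble spread here covers only the model-uncertainty factor mentioned above.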
Our model demonstrates robustness across diverse compound-protein interactions. By utilizing amino acid sequence and compound structure representations, CPI-Pred outperforms state-of-the-art models on unseen compounds and dissimilar enzymes. The confidence predictor provides additional insights, allowing users to gauge the trustworthiness of individual predictions.
Our workflow holds promise for addressing various metabolic engineering challenges, including enzyme design, drug discovery, and personalized medicine. By combining CPI-Pred's predictions with confidence assessments, researchers can make informed decisions and prioritize experimental validation. In summary, our integrated approach not only enhances prediction accuracy but also introduces a confidence assessment, bridging the gap between computational predictions and experimental validation.