2024 AIChE Annual Meeting

Predicting Amyloid Fibrillation through Transfer Learning



Drug discovery is a complex multi-objective optimization problem that requires balancing the biological activity and developability of target compounds. Recent advances in generative machine learning models have helped streamline this process, but these models lack precise control over target biophysical properties, many of which can be harmful to the development of a peptide therapeutic. Amyloid fibrils are one such liability: aggregates characterized by ordered stacks of peptides that form a fibrous structure. These aggregates are difficult to metabolize and are often less potent than the free molecules. In this work, we develop a generalizable model that predicts amyloid fibrillation from protein language model (pLM) embeddings. We use ESM2, a pLM, to generate latent embeddings for sequences of interest, which are then passed to our model. Experimental fibrillation data are limited, and the largest public datasets consist of sequences that are much shorter than therapeutically relevant peptides. We therefore explore transfer-learning preprocessing strategies that allow the model to generalize to new sequence lengths, including mean-pooling and a modified convolutional neural network with attention weights, referred to as light attention (LA). The processed embeddings are passed to a standard multilayer perceptron (MLP), and predictions are scored against labelled data. These architectures demonstrate high predictive power when evaluated on two publicly available datasets and can help expedite the development of peptide-based pharmaceuticals.
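The abstract does not give implementation details, so the following is only a minimal sketch of how the two pooling strategies might turn a variable-length matrix of per-residue embeddings into a fixed-size vector for an MLP. It uses pure Python with random numbers standing in for ESM2 embeddings. The light-attention step follows the commonly described form (two convolutions over the sequence, one giving attention scores and one giving values, with a softmax over positions and an added max-pool) and may differ from the authors' modified version. All function names, dimensions, and weights here are hypothetical and untrained.

```python
import math
import random

def mean_pool(embeddings):
    """Average per-residue vectors (L x D) into a single D-vector."""
    length, dim = len(embeddings), len(embeddings[0])
    return [sum(row[d] for row in embeddings) / length for d in range(dim)]

def conv1d(embeddings, kernel, bias):
    """Zero-padded 1D convolution along the sequence axis.
    kernel[k][i][o] has odd width K, input dim D, output dim D_out."""
    length, width = len(embeddings), len(kernel)
    d_in, d_out = len(kernel[0]), len(kernel[0][0])
    half = width // 2
    out = []
    for pos in range(length):
        vec = list(bias)
        for k in range(width):
            src = pos + k - half
            if 0 <= src < length:  # positions outside the sequence act as zeros
                for i in range(d_in):
                    for o in range(d_out):
                        vec[o] += embeddings[src][i] * kernel[k][i][o]
        out.append(vec)
    return out

def light_attention_pool(embeddings, attn_kernel, attn_bias, val_kernel, val_bias):
    """Softmax-weighted sum over positions, concatenated with a max-pool.
    Output length is 2 * D_out regardless of sequence length."""
    scores = conv1d(embeddings, attn_kernel, attn_bias)
    values = conv1d(embeddings, val_kernel, val_bias)
    length, d_out = len(values), len(values[0])
    pooled_attn, pooled_max = [], []
    for o in range(d_out):
        col = [scores[p][o] for p in range(length)]
        top = max(col)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in col]
        total = sum(exps)
        pooled_attn.append(sum(e / total * values[p][o] for p, e in enumerate(exps)))
        pooled_max.append(max(values[p][o] for p in range(length)))
    return pooled_attn + pooled_max

def mlp_predict(features, w1, b1, w2, b2):
    """One hidden ReLU layer and a sigmoid output: a fibrillation score in (0, 1)."""
    hidden = [max(0.0, sum(f * w for f, w in zip(features, row)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(h * w for h, w in zip(hidden, w2)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy sizes; real pLM embeddings are far wider.
rng = random.Random(0)
DIM, WIDTH, HIDDEN = 8, 3, 4
attn_k = [rand_matrix(DIM, DIM, rng) for _ in range(WIDTH)]
val_k = [rand_matrix(DIM, DIM, rng) for _ in range(WIDTH)]
zero_bias = [0.0] * DIM
w1 = rand_matrix(HIDDEN, 2 * DIM, rng)
b1 = [0.0] * HIDDEN
w2 = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

# A short and a long sequence produce features of the same size.
for length in (6, 40):
    emb = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(length)]
    feats = light_attention_pool(emb, attn_k, zero_bias, val_k, zero_bias)
    score = mlp_predict(feats, w1, b1, w2, 0.0)
    print(length, len(mean_pool(emb)), len(feats), round(score, 3))
```

Both pooling routes remove the length dimension, which is what would let a model trained on short sequences be applied to longer ones. Mean-pooling weights every residue equally, whereas the attention route can learn to emphasize particular positions.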
