(594c) Explainable Support Vector Machine Models for Analyzing Structure-Function Relationships of Membrane-Active Peptides | AIChE

Authors 

Song, H., Inha University
DeFreese, W., Auburn University
Kieslich, C., Auburn University

Antibiotics and small-molecule drugs have well-known drawbacks, particularly their limited adaptability against harmful microbes, which has motivated research into alternative therapeutics such as membrane-active peptides (MAPs). In this study, we conducted a machine-learning analysis of a diverse set of therapeutic peptides within the broad class of MAPs, focusing on their capacity either to traverse cellular membranes (cell-penetrating peptides, CPPs) or to disrupt them (membranolytic peptides, MDPs).

To accomplish this, we compiled datasets from the literature that encompass cell-penetrating, hemolytic, anticancer, antifungal, antiparasitic, antibacterial, antiviral, and mammalian-targeting peptides. Leveraging known periodicities in peptide properties that correlate with structure and function, we used Fourier transforms to generate features that measure the amplitude of amino acid property oscillations along the sequence. We then applied an in-house feature selection procedure, based on non-linear support vector machines (SVMs), to derive structure-function fingerprints for each class of MAPs.
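The Fourier-based feature idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which amino acid properties, transform length, or normalization were used, so the Kyte-Doolittle hydropathy scale, the `n_points` value, and the function name `fourier_amplitudes` are all assumptions made here for illustration.

```python
import numpy as np

# Kyte-Doolittle hydropathy scale. Assumption: the abstract does not say
# which amino acid properties were used; this scale is only an example.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fourier_amplitudes(sequence, scale=KD, n_points=64):
    """Return amplitudes of property oscillations along a peptide sequence.

    Hypothetical helper: maps residues to property values, removes the mean,
    and takes a real FFT. Sequences shorter than n_points are zero-padded
    (longer ones are truncated), so every peptide yields n_points // 2 + 1
    features regardless of its length.
    """
    signal = np.array([scale[aa] for aa in sequence], dtype=float)
    signal -= signal.mean()  # drop the constant (zero-frequency) offset
    spectrum = np.fft.rfft(signal, n=n_points)
    return np.abs(spectrum) / len(sequence)


# Toy peptide whose hydropathy alternates with a period of 4 residues.
amps = fourier_amplitudes("LLKK" * 4)
freqs = np.fft.rfftfreq(64)  # frequencies in cycles per residue
peak_frequency = freqs[int(np.argmax(amps))]
print(len(amps), peak_frequency)
```

For the toy sequence, the largest amplitude falls at 0.25 cycles per residue, matching its 4-residue repeat. An amphipathic alpha-helix would instead be expected to peak near 1/3.6 cycles per residue. Vectors like `amps`, computed for several properties, are the kind of input a feature selection step could then prune.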

As a reference point, we developed models using a more traditional feature set that includes amino acid compositions, dipeptide compositions, and physicochemical properties of amino acids. Additionally, to gauge the performance of our approach against state-of-the-art models, we compared our models to deep-learning models trained by fine-tuning a protein language model (ESM2) to predict each class of MAPs.
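The traditional composition features mentioned above can be sketched as follows. This is a minimal illustration under assumptions: the function names, the residue ordering, and the normalization by sequence length are choices made here, not details taken from the study.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (an arbitrary
# but fixed ordering, so feature positions are consistent across peptides).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Fraction of each of the 20 residues in the sequence (20 features)."""
    counts = Counter(sequence)
    n = max(len(sequence), 1)  # avoid division by zero for empty input
    return [counts[aa] / n for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Fraction of each adjacent residue pair in the sequence (400 features)."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    counts = Counter(pairs)
    total = max(len(pairs), 1)  # a single-residue sequence has no pairs
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


aac = amino_acid_composition("AAKK")
dpc = dipeptide_composition("AAKK")
print(len(aac), len(dpc))
```

Together these two descriptors already give 420 features per peptide before any physicochemical properties are added, which gives a sense of the size of a feature set built this way.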

Furthermore, we compared our predictions with recently developed models in the literature that were trained on the same datasets for each peptide class. Comparing our Fourier-transform-based approach with the state of the art shows that it yields models with significantly fewer features and at least comparable performance. Finally, we used the derived structure-function fingerprints to cluster the classes of MAPs, which provides insight into the design of MAPs with improved specificity.

This study has the potential to accelerate the design and development of novel membrane-active peptides, offering promising avenues for drug discovery and clinical trials.

Keywords: Membrane-active peptides, support vector machine models, sequential feature generation, feature selection.