(575b) Meta-Analysis of MHC Class I Peptide Binding Interactions Using SVM Models | AIChE

Authors 

Islam, S. - Presenter, Auburn University
Song, H., Inha University
Kieslich, C., Auburn University
Major Histocompatibility Complex (MHC) Class I and Class II molecules play pivotal roles in the adaptive immune system. MHC Class I molecules, expressed on nucleated cells, display intracellularly derived peptides for recognition by CD8+ T-cells, while MHC Class II molecules, expressed on antigen-presenting cells (APCs), display peptides for recognition by CD4+ T-cells. Thus, the binding of peptides derived from protein antigens to MHC molecules is a prerequisite for T-cell immunogenicity. The ability to reliably predict MHC-binding peptides, and consequently T-cell epitopes, can have significant implications for areas of immunology such as vaccine design and protein therapeutics. Several machine-learning-based predictors are available for identifying immunogenic T-cell epitopes based on the binding properties of MHC Class I and II proteins. Despite the availability of these models, predicting the binding of peptides to MHC molecules remains challenging because of the polymorphic nature of the Human Leukocyte Antigen (HLA) alleles that encode the MHC proteins. Understanding how peptide fragments bind to immune proteins such as MHC molecules is crucial to combating the threats that different pathogens pose to humans.

Computational epitope predictors require accurate statistical models that can calculate descriptors for the interactions between pathogenic peptides and immune proteins. Our project involves training a multi-allele-specific support vector machine (SVM) model to classify peptides as MHC Class I binders or non-binders. A comprehensive dataset of 84,041 peptides with measured binding affinities for 94 unique MHC Class I alleles was selected from the Immune Epitope Database (IEDB) for training and testing the model. The algorithm incorporates cross-validation using multiple training and testing datasets, application of Fourier transforms to the peptide sequences, identification of an essential set of predictive features, and tuning of the hyper-parameters. This binary classification model was previously developed for MHC Class II-binding 15mers and has now been adapted to MHC Class I-binding 9mers and 10mers. The aim of this work is to develop a “fingerprint” of the types of peptides that bind to a given MHC molecule using a feature selection approach. Based on these fingerprints, a clustering analysis of the MHC molecules by function, or binding preference, is performed. In the presentation, we will cover the selection of the peptide datasets, the tuning of the hyper-parameters, the feature selection criteria, and the model's accuracy. We will also discuss the findings of the analysis and how they can provide novel insight into the susceptibility of individuals to different pathogens based on their HLA alleles.
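
As a rough illustration of how Fourier transforms might be applied to peptide sequences before classification, the sketch below maps each residue to a number, zero-pads 9mers and 10mers to a common length, and takes the magnitudes of the discrete Fourier transform as features. The abstract does not state which residue encoding, padding scheme, or affinity cutoff the authors used, so the Kyte-Doolittle hydropathy scale, the `pad_to` parameter, the 500 nM threshold, and the function names `fourier_features` and `label_binder` are all assumptions made for this example. The resulting feature vectors could then be passed to an SVM classifier.

```python
import cmath

# Kyte-Doolittle hydropathy values. Assumption: the abstract does not
# name the residue encoding used before the Fourier transform.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fourier_features(peptide, pad_to=10):
    """Return DFT magnitudes of a numerically encoded peptide.

    The peptide is zero-padded to `pad_to` residues (hypothetical choice)
    so that 9mers and 10mers yield vectors of the same length.
    """
    if len(peptide) > pad_to:
        raise ValueError("peptide is longer than pad_to")
    signal = [HYDROPATHY[aa] for aa in peptide.upper()]
    signal += [0.0] * (pad_to - len(signal))
    n = len(signal)
    features = []
    # A real-valued signal has a symmetric spectrum, so only the first
    # n // 2 + 1 frequencies carry distinct magnitudes.
    for k in range(n // 2 + 1):
        coeff = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        features.append(abs(coeff) / n)
    return features


def label_binder(ic50_nm, threshold_nm=500.0):
    """Binary label from a binding affinity in nM.

    Assumption: a 500 nM cutoff, a common convention that the abstract
    itself does not specify.
    """
    return 1 if ic50_nm <= threshold_nm else 0


if __name__ == "__main__":
    # A 9mer and a 10mer produce feature vectors of equal length.
    print(len(fourier_features("GILGFVFTL")))
    print(len(fourier_features("GILGFVFTLA")))
    print(label_binder(50.0), label_binder(5000.0))
```

Zero-padding is only one way to reconcile the two peptide lengths; training separate models per length would avoid the padding artifact in the spectrum at the cost of less data per model.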