(2eb) Exploring Antibody Design Space with Deep Learning Models | AIChE

Authors 

Mahajan, S. P. - Presenter, Johns Hopkins University

Research Interests

In the last decade, protein structural biology has come of age with groundbreaking advances in structure determination techniques such as cryo-EM [1] and in artificial intelligence (AI)-accelerated computational models [2,3]. We are in an age of abundant structure and sequence data [4–6]. While AI models have found unprecedented success in predicting structure and will continue to do so for related problems over the next decade, we are only beginning to parse the underlying sequence [7,8] and structural motifs (or patterns) that lead to function and dysfunction in different biological contexts. AI models, especially deep learning (DL) models, can be applied to learn meaningful and rich representations of the building blocks of proteins at the sequence and structural levels. Such models will eventually encode the complex patterns that capture the evolution, diversification, and continued fine-tuning of structure and sequence motifs to effect complex function and enable life at the level of cells, tissues, and systems.

I aim to apply AI tools such as deep learning to learn rich, meaningful representations and to build bottom-up models describing protein structural and functional motifs in different contexts. Furthermore, I aim to tweak, modulate, repurpose, and engineer these motifs to 1) recapitulate known structure-function relationships, 2) investigate how they contribute to normal and diseased states of proteins and protein-protein systems, and 3) extrapolate them to discover and engineer de novo structure-function relationships. As a Principal Investigator, I aim to pursue the following research goals:

  1. Investigate whether the sequence and structural motifs at antibody-antigen interfaces are learnable in a manner that allows us to engineer antibodies to target specific antigens.
  2. Determine the structural determinants of high-specificity or high-affinity interactions at antibody-antigen and protein-protein interfaces.
  3. Apply learned representations of sequence and structural motifs at interfaces to engineer bottom-up designs of antibody-antigen interfaces with context-specific motifs that encode desired properties.
  4. Decipher the sequence and structural motifs that determine “self” and “non-self” recognition by the immune system.
  5. Implement deep learning models to learn the structural motifs underlying the various mechanisms that encode enzyme specificity, enabling bottom-up design of de novo enzymes with finely tuned specificities.

References

  1. Herzik Jr, M. A. Cryo-electron microscopy reaches atomic resolution. Nature 587, 39–40 (2020).
  2. Gao, W., Mahajan, S. P., Sulam, J. & Gray, J. J. Deep Learning in Protein Structural Modeling and Design. Patterns 1, 100142 (2020).
  3. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021) doi:10.1038/s41586-021-03819-2.
  4. Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).
  5. Suzek, B. E., Wang, Y., Huang, H., McGarvey, P. B. & Wu, C. H. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).
  6. Hsu, C. et al. Learning inverse folding from millions of predicted structures. 2, 1–22 (2022).
  7. Meier, J. et al. Language models enable zero-shot prediction of the effects of mutations on protein function. bioRxiv 2021.07.09.450648 (2021).
  8. Rives, A. et al. Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences. bioRxiv 622803 (2019) doi:10.1101/622803.

Teaching Statement

Trained as a chemical engineer and having assisted in teaching many courses at the undergraduate and graduate levels, I look forward to developing and teaching courses in chemical engineering. I am also excited to develop several graduate-level courses, such as Machine Learning for Chemical Engineers, Computational Methods for Protein Engineering, and Fundamentals of Active Learning for Process Design. I aim to design course material that caters to the needs of a diverse group of students with different learning abilities, incorporating indirect and direct feedback to improve my teaching materials and technique. I also aim to update course materials to prepare students for a rapidly changing job market, incorporating the latest methods from artificial intelligence and data science.