(120d) Design of Diverse, Functional Mitochondrial Targeting Sequences across Eukaryotic Organisms Using Variational Autoencoders | AIChE

Authors 

Zhao, H., University of Illinois-Urbana
Zaidi, A., University of Illinois at Urbana-Champaign
Singh, N., Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign
Xue, X., Department of Plant Biology, University of Illinois at Urbana-Champaign
Chen, L. Q., Department of Plant Biology, University of Illinois at Urbana-Champaign
A eukaryotic cell consists of a well-defined architecture of organelles carrying out biochemically distinct cellular processes. These membrane-bound compartments provide nuclear-encoded proteins with a physiological context in which to carry out their functions. Recently, there has been interest in exploiting the organelle environment and machinery to fabricate compartments for biochemical production and to develop new methods of therapeutic intervention. The multifaceted contributions of mitochondria to cellular metabolism have made them an important target not only for metabolic engineering but also for disease treatment. However, only a few protein-localization tags have been characterized for mitochondrial localization, even though targeting efficiency is known to be influenced by the passenger protein. To address this limitation, we used generative models, an unsupervised machine learning framework, to design a toolkit of mitochondrial targeting sequences (MTS), followed by in vivo characterization in multiple organisms. A variational autoencoder was constructed to sample artificial sequences. A high fraction of the generated peptide sequences (95%) were predicted to be functional by TargetP 2.0. Furthermore, the sequences were clustered using UniRep model embeddings to assign each an ‘organism’ label for in vivo testing. The artificial sequences were fused to the N-terminus of fluorescent proteins, and localization was characterized by confocal microscopy. Success rates of 50%, 100%, and 75% were obtained for Saccharomyces cerevisiae, HEK293, and Nicotiana tabacum, respectively. The functional sequences are highly diverse, with normalized edit distances greater than 50% when compared to the MTS reported in the UniProt database. Overall, these results suggest that highly functional, diverse peptides capable of localizing proteins to the mitochondrial matrix of diverse organisms can be designed.
Currently, we are applying this strategy to design artificial peptides for protein localization to the mitochondrial matrix of non-model organisms such as Rhodosporidium toruloides. We also provide a list of potentially functional MTS for various organisms to expand the metabolic engineering toolbox for mitochondrial compartmentalization. This work should aid in enhancing titers of biochemicals, expressing functional heterologous proteins, and improving mitochondrial DNA editing and biomolecule delivery capabilities.
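The diversity figure quoted above, a normalized edit distance greater than 50% relative to known MTS, can be illustrated with a minimal sketch. The abstract does not state how the distance was normalized or aggregated, so this example assumes the Levenshtein distance divided by the length of the longer sequence, and reports the distance to the closest reference. The function names and the short peptide strings are hypothetical and chosen only for illustration; they are not sequences from this work or from UniProt.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with two rows of dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return edit_distance(a, b) / longest


def nearest_reference_distance(candidate: str, references: Iterable[str]) -> float:
    """Smallest normalized distance from a candidate to any reference sequence."""
    distances = [normalized_edit_distance(candidate, ref) for ref in references]
    if not distances:
        raise ValueError("at least one reference sequence is required")
    return min(distances)


if __name__ == "__main__":
    # Toy strings for illustration only; not real targeting sequences.
    toy_references = ["MKRLAAGS", "MAAKLLRR"]
    toy_candidate = "MQRSLVNT"
    score = nearest_reference_distance(toy_candidate, toy_references)
    print(f"nearest normalized edit distance: {score:.2f}")
```

Under this assumed definition, a value above 0.5 means that more than half of the positions in the longer sequence would have to be edited to turn the designed peptide into its closest known counterpart.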