(622c) Development of a Naloxone Biosynthetic Pathway Using Cell Free Protein Synthesis and Deep Learning | AIChE

Authors 

Varner, J. D. - Presenter, Cornell University
Zhang, Z., Cornell University

Introduction

Opioids are a class of drugs highly valued for their potent analgesic properties; however, they are also highly addictive and cause severe side effects, including death from respiratory depression. The Centers for Disease Control and Prevention estimates that 130 Americans die daily from opioid overdoses and that the number of opioid overdose deaths in 2017 represents a six-fold increase compared to 1999. Alternative manufacturing methods may be the disruption needed to overcome the cost and availability issues associated with naloxone, the opioid overdose antidote. To address this need, we propose developing a biosynthetic route for naloxone production with morphine as the precursor. In this work, we produced morphine dehydrogenase (MDH), an enzyme that catalyzes the oxidation of morphine to morphinone, by cell free protein synthesis (CFPS), taking advantage of the speed of CFPS compared to cell-based culture. We then developed a computational tool to design biosynthetic routes from morphinone to naloxone. In particular, to facilitate our experimental design, we combined deep learning with computational biochemistry and developed an in silico framework for efficiently exploring the design space of biosynthetic pathways. Taken together, we developed an experimental protocol for expressing the first enzyme in any potential naloxone biosynthetic pathway and a multistep retrobiosynthesis pipeline for pathway prediction that could be used to design routes to naloxone and, more generally, could enable anti-opioid metabolic engineering strategies.

Methods

In this work, we turned to cell free protein synthesis (CFPS) as a platform for rapid expression of our desired enzyme. CFPS reactions consisted of linear DNA coding for C-terminally 6xHis-tagged MDH and GamS nuclease inhibitor added to myTXTL Sigma 70 Master Mix (Arbor Biosciences, Ann Arbor, MI). Reactions were sampled at 2 h, 4 h, and 6 h, with at least three biological replicates per time point. Gene expression levels were determined by quantitative reverse transcription PCR (RT-qPCR), and an mRNA standard curve was used to determine absolute mRNA concentrations. Three technical qPCR replicates were performed for each standard and each biological replicate sample. The His-tagged product was purified under native conditions using Ni-NTA spin columns and stored at -80 °C until retrieved for further analysis. Protein was quantified by a Bradford assay, and SDS-PAGE was performed to confirm the molecular weight of the expressed protein.
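As an illustration of how an mRNA standard curve can yield absolute concentrations, the sketch below fits Ct against log10 of the standard concentration and inverts the fit for a sample. It is a minimal sketch, not the analysis used in this work: the function names, the nM units, and every number in the dilution series are made-up assumptions.

```python
import math
from statistics import mean


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate an absolute concentration."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series (NOT data from this study):
# standard concentration in nM -> Ct values of three technical replicates.
standards = {
    1.0: [30.1, 30.0, 29.9],
    10.0: [26.7, 26.8, 26.6],
    100.0: [23.4, 23.3, 23.5],
    1000.0: [20.1, 20.0, 20.2],
}

# Average the technical replicates before fitting.
concs = list(standards)
mean_cts = [mean(cts) for cts in standards.values()]
slope, intercept = fit_standard_curve(concs, mean_cts)

# Hypothetical sample, also averaged over its technical triplicate.
sample_ct = mean([25.0, 25.1, 24.9])
sample_conc = concentration_from_ct(sample_ct, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={sample_conc:.1f} nM")
```

Averaging technical replicates before fitting keeps each standard to a single point; fitting all replicate points individually is an equally common choice.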

To develop our computational biosynthetic pathway design framework, we assembled metabolic reaction and enzymatic template data from public databases. A data augmentation procedure, adapted from the literature, was carried out to enrich the assembled reaction dataset with artificial metabolic reactions generated by enzymatic reaction templates. Two neural network-based pathway ranking models were trained as binary classifiers to distinguish assembled reactions from their artificial counterparts; each model outputs a scalar quantifying the likelihood that a 1-step or 2-step pathway, respectively, is plausible. Combining these two models with enzymatic templates, we built a multistep retrobiosynthesis pipeline and validated it by computationally reproducing some natural and non-natural pathways.
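The general idea of combining templates with 1-step and 2-step ranking models can be sketched as a backward beam search. This is only a toy under stated assumptions: the compound names are abstract placeholders, the templates are a lookup table rather than real enzymatic reaction rules, the two scoring functions are hypothetical stand-ins for the trained models, and averaging the two scores is an assumed combination rule, not the one used in the pipeline.

```python
from typing import Callable, Dict, List, Tuple

Pathway = Tuple[str, ...]  # compounds ordered from target back to precursor


def retro_search(
    target: str,
    start: str,
    templates: Dict[str, List[str]],
    score_1step: Callable[[str, str], float],
    score_2step: Callable[[str, str, str], float],
    beam_width: int = 5,
    max_steps: int = 4,
) -> List[Tuple[Pathway, float]]:
    """Search backward from `target` until `start` is reached."""
    beam: List[Tuple[Pathway, float]] = [((target,), 1.0)]
    found: List[Tuple[Pathway, float]] = []
    for _ in range(max_steps):
        candidates: List[Tuple[Pathway, float]] = []
        for path, score in beam:
            product = path[-1]
            for precursor in templates.get(product, []):
                if precursor in path:  # avoid cycles
                    continue
                step = score_1step(precursor, product)
                if len(path) >= 2:
                    # Use the downstream step as context (assumed: simple average).
                    step = 0.5 * (step + score_2step(precursor, product, path[-2]))
                extended = (path + (precursor,), score * step)
                if precursor == start:
                    found.append(extended)
                else:
                    candidates.append(extended)
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]  # keep only the best partial pathways
        if not beam:
            break
    found.sort(key=lambda c: c[1], reverse=True)
    return found


# Toy retro-templates: product -> possible precursors (placeholder compounds).
toy_templates = {"D": ["C", "X"], "C": ["B"], "X": ["B"], "B": ["A"]}

# Hypothetical plausibility scores standing in for the two trained models.
one_step = {("C", "D"): 0.9, ("X", "D"): 0.4}
two_step = {("B", "C", "D"): 0.8, ("B", "X", "D"): 0.3}

results = retro_search(
    "D", "A", toy_templates,
    lambda pre, prod: one_step.get((pre, prod), 0.5),
    lambda pre, mid, prod: two_step.get((pre, mid, prod), 0.5),
)
for pathway, score in results:
    print(" <- ".join(pathway), round(score, 4))
```

Multiplying step scores penalizes longer pathways; whether that bias is desirable depends on the application.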

Results

MDH mRNA concentrations peaked at 4 h of incubation, while the protein continued to accumulate. The maximum MDH protein yield, observed at 6 h, was 0.59 ± 0.003 mg/mL, which is of the same order of magnitude as that reported by the cell free extract manufacturer (Arbor Biosciences). The protein product appeared pure; there were no contaminating bands on the SDS-PAGE gels. The calculated molecular weight (MW) of each batch of product was within 0.9% of our expected MW of 33.1 kDa for 6xHis-tagged MDH. The peak and subsequent decline of mRNA concentration indicate mRNA degradation; however, protein degradation did not appear to be a significant concern over this range of incubation periods. While our experimental workflow is promising, our pathway design process will be greatly assisted by a functioning retrobiosynthetic computational pipeline.

Our neural network-based ranking models outperformed their corresponding baseline models by a significant margin, and our multistep retrobiosynthesis pipeline was able to recover some reported pathways. We validated each neural network-based ranking model by comparing its performance on the testing dataset with that of the corresponding baseline model. Tanimoto scores between reactant and product were used as the baseline for the neural network-based 1-step ranking model (NN1PR). NN1PR covered 55% of the dataset within its top 10 candidates, compared with only 3% for the Tanimoto baseline. NN1PR in turn served as the baseline for our neural network-based 2-step ranking model (NN2PR) on ranking 2-step pathways. NN2PR covered 67% of the dataset within its top 10 candidates, whereas NN1PR covered only 41%. NN2PR’s better performance over NN1PR demonstrated the value of incorporating information from previous steps in pathway ranking. When we ran the multistep pipeline on the production of 1,4-butanediol (1,4-BDO) from 4-hydroxybutyrate, it found three candidate pathways that could fulfill the assigned task. Among them, one was an exact match to a pathway reported in the literature. With only NN1PR as the ranking model in our pipeline, the exact match ranked in the top 67%; with both NN1PR and NN2PR as ranking models, its rank improved to the top 7.4%. This observation reinforced the value of using multiple ranking models inside a pipeline. We also tested our multistep pipeline with the glycolysis pathway. The pipeline was given fructose-6-phosphate as the starting compound and explored backward to see whether it could trace fructose-6-phosphate back to glucose. We found some predictions with interesting by-products and have been working on experimental validation.
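The Tanimoto baseline and the top-10 coverage metric can be illustrated with a short sketch. It assumes fingerprints are represented as sets of on-bit indices (standing in for molecular fingerprints such as ECFP) and that coverage means the fraction of test cases whose true answer appears among the top k ranked candidates; the function names and the rank-percentage definition are assumptions for illustration, not the evaluation code behind the numbers above.

```python
from typing import Dict, List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_by_tanimoto(product_fp: Set[int], candidates: Dict[str, Set[int]]) -> List[str]:
    """Baseline ranking: candidates most similar to the product come first."""
    return sorted(
        candidates,
        key=lambda name: tanimoto(product_fp, candidates[name]),
        reverse=True,
    )


def top_k_coverage(
    ranked_lists: Sequence[Sequence[str]],
    true_items: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of test cases whose true item is within the top k candidates."""
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, true_items) if truth in ranked[:k]
    )
    return hits / len(true_items)


def rank_percentage(ranked: Sequence[str], item: str) -> float:
    """Position of `item` in the ranking, as a percentage of the list length."""
    return 100.0 * (ranked.index(item) + 1) / len(ranked)


# Toy fingerprints (made up): candidate reactants for one product.
product_fp = {1, 2, 3, 4}
candidates = {"x": {1, 2, 3}, "y": {7, 8}, "z": {1, 2, 3, 4, 5}}
ranking = rank_by_tanimoto(product_fp, candidates)
print(ranking, top_k_coverage([ranking], ["x"], k=2), rank_percentage(ranking, "x"))
```

A learned ranking model would replace `rank_by_tanimoto` while the two metrics stay unchanged, which is what makes the baseline comparison direct.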

Discussion

Although we were able to achieve a reasonable yield of morphine dehydrogenase using a cell free system, this study could be expanded upon in a few ways. We plan to increase cell free incubation times to determine the ideal CFPS duration. While it would be ideal to test the activity of our enzyme product on morphine, doing so requires state and federal licenses because morphine is a Schedule II controlled substance. While waiting for approvals, we plan to determine catalytic activity using an aldose reductase activity assay, as MDH is part of that enzyme family.

To further demonstrate the effectiveness of the computational pipeline, we plan to experimentally validate some of its predictions using simple in vitro multistep glycolysis pathways with fructose-6-phosphate as the starting compound. Enzymes, substrates, and cofactors will be combined in microfuge tubes and allowed to react. Reactions will be quenched at different time points and analyzed by liquid chromatography-mass spectrometry. While we expect the glycolysis reactions to proceed as usual, we are interested in the other predictions made by our multistep pipeline.

While the experimental validation of our predictions for the glycolysis pathway is a work in progress, several directions are under study to improve the accuracy, capability, and efficiency of our multistep computational pipeline. Instead of ECFP, fingerprints learned by graph neural networks could provide better descriptions of molecules. Transfer learning could help develop NNnPR (n>2) models with little available data; incorporating a template generation module could make the pipeline more adaptive; and inserting a template selection model could improve runtime efficiency by avoiding unnecessary cheminformatic computation. Evaluating the feasibility of each step with an enzyme design tool and using its feedback for further exploration could also be beneficial. Once we have improved our computational framework with these developments, it will be used to guide the experimental synthesis of naloxone, in the hope of finding new and efficient biosynthetic pathways.