(140b) Learning to Design and Validate Small-Molecule Synthetic Routes from Historical Reaction Data
2018 AIChE Annual Meeting
Pharmaceutical Discovery, Development and Manufacturing Forum
Data Analytics for Process Prediction
Tuesday, October 30, 2018 - 12:55pm to 1:20pm
In this talk, we will describe our recent efforts to develop software for computer-aided synthesis planning. The overarching theme of our work is how to most effectively leverage historical reaction data to inform decision-making in small-molecule pathway design.
The overall synthesis planning workflow contains a number of interconnected modules. We focus on two critical aspects of computer-aided synthesis planning and on how machine learning and other data-driven techniques have enabled new approaches to both. First, we discuss the problem of retrosynthetic planning (i.e., identification of suitable starting materials) and how both the recursive expansion step and the search strategy lend themselves to machine learning approaches. Second, we discuss the challenge of in silico reaction validation, which can be addressed by solving the inverse problem, namely forward reaction prediction. We summarize the neural network-based approaches we have taken to develop models that, after being trained on previously published reactions, can anticipate the products of a chemical reaction. Here we use the model for reaction validation, but its utility extends to the prediction of side products and impurities. Finally, we describe how these techniques for retrosynthesis and forward prediction are integrated into an overall workflow that, for a given molecular target, predicts a rank-ordered list of reaction pathways connecting the target to purchasable starting materials via a series of plausible reaction steps. The integrated program offers additional features for excluding specific reactions or chemicals, e.g., because of IP or toxicity concerns.
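The workflow described above (recursive retrosynthetic expansion, a plausibility check on each step, ranking of complete pathways, and user-specified exclusions) can be illustrated with a minimal Python sketch. This is not the software presented in the talk. Every name in it (`RETRO_RULES`, `PURCHASABLE`, `Route`, `plan`) is hypothetical, the molecules are placeholder strings rather than real structures, and the fixed per-step scores merely stand in for the output of a trained forward-prediction model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical toy data. Each target maps to candidate one-step disconnections:
# (reaction name, precursors, plausibility score in [0, 1]).
# The score is a stand-in for what a trained forward-prediction model might return.
RETRO_RULES = {
    "amide": [
        ("amide coupling", ("acid", "amine"), 0.9),
        ("acyl chloride aminolysis", ("acyl_chloride", "amine"), 0.8),
    ],
    "acyl_chloride": [("chlorination", ("acid",), 0.7)],
    "acid": [("ester hydrolysis", ("ester",), 0.95)],
}
PURCHASABLE = frozenset({"amine", "ester"})


@dataclass(frozen=True)
class Route:
    steps: Tuple[str, ...]  # reaction steps in synthesis order
    score: float            # product of the per-step plausibility scores


def plan(
    target: str,
    max_depth: int = 3,
    min_score: float = 0.5,
    banned_chemicals: FrozenSet[str] = frozenset(),
    banned_reactions: FrozenSet[str] = frozenset(),
) -> List[Route]:
    """Return routes from purchasable materials to `target`, best score first."""
    if target in banned_chemicals:
        return []
    if target in PURCHASABLE:
        return [Route((), 1.0)]  # nothing left to make
    if max_depth == 0:
        return []

    routes: List[Route] = []
    for name, precursors, score in RETRO_RULES.get(target, []):
        # "Validation": drop excluded reactions and implausible steps.
        if name in banned_reactions or score < min_score:
            continue
        step = f"{' + '.join(precursors)} -> {target} ({name})"
        partial = [Route((step,), score)]
        # Recursive expansion: every precursor must itself be reachable.
        for precursor in precursors:
            sub_routes = plan(
                precursor,
                max_depth - 1,
                min_score,
                banned_chemicals,
                banned_reactions,
            )
            partial = [
                Route(sub.steps + route.steps, route.score * sub.score)
                for route in partial
                for sub in sub_routes
            ]
        routes.extend(partial)
    return sorted(routes, key=lambda route: route.score, reverse=True)


if __name__ == "__main__":
    for route in plan("amide"):
        print(f"{route.score:.3f}", " | ".join(route.steps))
```

Calling `plan("amide", banned_reactions=frozenset({"amide coupling"}))` leaves only the longer route through the acyl chloride, which mirrors the exclusion feature mentioned in the abstract. The exhaustive depth-first enumeration used here is chosen only for brevity; it is an assumption of this sketch and should not be read as the search strategy used in the actual program.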