(372g) Molecule Selection and Batch Synthesis Planning for Design-Make-Test Cycles
2023 AIChE Annual Meeting
Pharmaceutical Discovery, Development and Manufacturing Forum
Advances in Drug Discovery Processes (including HTE)
Tuesday, November 7, 2023 - 10:05am to 10:30am
We address this challenge by formulating the selection of molecules and their synthetic routes as a constrained integer linear optimization problem. Previous work has explored synthesis planning that optimizes cost [1] and batch efficiency [2], but these algorithms take the set of molecules to be synthesized as a known input. Our proposed optimization algorithm extends this work with a scalarized objective function that simultaneously considers the molecular design objective, batch synthesis complexity, and synthetic feasibility. Retrosynthesis trees are first constructed for each molecule, and a set of constraints is defined such that the optimal decision variables correspond to synthetic routes contained in the pre-defined trees. The constrained linear optimization problem is solved using the commercial solver Gurobi, and the optimal decision variables are converted to a set of selected molecules and synthetic routes. The proposed workflow is described in greater detail below:
- Each candidate is assigned a “utility,” which captures the perceived information gained from testing that molecule.
- Retrosynthesis trees are constructed for each candidate molecule using ASKCOS [3]. The retrosynthesis trees for all candidate molecules are subsequently combined into a reaction “forest”, which is a graph that connects reaction nodes with molecule (starting material, intermediate, or candidate) nodes.
- A constrained linear optimization problem is formulated to select a set of molecules and their synthesis routes to optimize a scalarization of multiple objectives. The objectives are (1) to maximize the sum of the utilities of selected molecules, (2) to minimize the total number of starting materials and reagents, and (3) to minimize the sum of reaction penalties, which capture the retrosynthesis model's level of uncertainty in a reaction.
- The optimal variables are converted to a set of selected molecules and synthetic routes.
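One way the optimization step above could be written down is sketched here. This is an illustrative formulation rather than the exact one used in this work, and all notation is assumed: C is the set of candidate molecules, S the set of starting materials, R the set of reaction nodes, u_i the utility of candidate i, p_j the penalty of reaction j, prod(i) the reactions that produce molecule i, react(j) the reactants of reaction j, and lambda_1, lambda_2, lambda_3 non-negative scalarization weights. The binary variables m_i and r_j indicate whether a molecule node or a reaction node is selected.

```latex
\begin{aligned}
\max_{m,\,r}\quad & \lambda_1 \sum_{i \in C} u_i\, m_i
  \;-\; \lambda_2 \sum_{s \in S} m_s
  \;-\; \lambda_3 \sum_{j \in R} p_j\, r_j \\
\text{s.t.}\quad & m_i \le \sum_{j \in \mathrm{prod}(i)} r_j
  && \forall\, i \notin S \\
& r_j \le m_k
  && \forall\, j \in R,\; k \in \mathrm{react}(j) \\
& m_i,\, r_j \in \{0, 1\}
\end{aligned}
```

The first constraint says that a molecule that is not a starting material can only be selected if at least one reaction producing it is selected; the second says that a selected reaction requires all of its reactants. The sketch counts only starting materials in the second objective term (reagents would need their own nodes) and assumes the forest is acyclic; a graph with cycles would need additional constraints to rule out self-supporting loops.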
We demonstrate that our algorithm is capable of scaling to candidate sets containing hundreds of molecules. Moreover, we show how adjusting scalarization weights affects the number of molecules selected, the number of overlapping reaction steps, and the retrosynthesis model's confidence in selected routes. Finally, we propose how the optimization task can be adjusted to minimize synthetic cost and maximize the probability of successful synthesis more rigorously.
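The effect of the scalarization weights can be illustrated on a toy problem. The sketch below is not the authors' implementation: it replaces the integer linear program and Gurobi with brute-force enumeration over reaction subsets, which is only practical for a handful of reactions, and the forest, utilities, penalties, and the function name best_selection are all hypothetical.

```python
from itertools import product

# Hypothetical toy reaction forest: reaction -> (product, reactants, penalty).
# None of these molecules or numbers come from the abstract.
REACTIONS = {
    "r1": ("A", ("s1", "s2"), 0.1),
    "r2": ("B", ("s2", "s3"), 0.2),
    "r3": ("C", ("B", "s4"), 0.6),  # C is made from candidate B
}
UTILITY = {"A": 1.0, "B": 0.8, "C": 0.5}  # candidate molecules
STARTING = {"s1", "s2", "s3", "s4"}       # purchasable starting materials


def best_selection(w_utility, w_start, w_penalty):
    """Enumerate reaction subsets and return (score, molecules, reactions)."""
    best = (0.0, frozenset(), frozenset())  # selecting nothing scores 0
    names = sorted(REACTIONS)
    for mask in product([0, 1], repeat=len(names)):
        chosen = [n for n, bit in zip(names, mask) if bit]
        produced = {REACTIONS[n][0] for n in chosen}
        reactants = {x for n in chosen for x in REACTIONS[n][1]}
        # A route is feasible only if every reactant is a starting
        # material or is itself produced by a chosen reaction.
        if not all(x in STARTING or x in produced for x in reactants):
            continue
        made = produced & set(UTILITY)
        score = (
            w_utility * sum(UTILITY[m] for m in made)
            - w_start * len(reactants & STARTING)
            - w_penalty * sum(REACTIONS[n][2] for n in chosen)
        )
        if score > best[0]:
            best = (score, frozenset(made), frozenset(chosen))
    return best


if __name__ == "__main__":
    for weights in [(1.0, 0.1, 1.0), (1.0, 0.1, 0.1), (1.0, 0.6, 1.0)]:
        score, molecules, reactions = best_selection(*weights)
        print(weights, sorted(molecules), sorted(reactions))
```

With these toy numbers, the first weight setting selects A and B, lowering the penalty weight also admits the less certain route to C, and raising the starting-material weight leaves nothing worth making. Shared starting material s2 is counted once, which is what rewards overlapping routes.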
[1] Badowski, T.; Molga, K.; Grzybowski, B. A. Selection of Cost-Effective yet Chemically Diverse Pathways from the Networks of Computer-Generated Retrosynthetic Plans. Chem. Sci. 2019, 10 (17), 4640–4651.
[2] Molga, K.; Dittwald, P.; Grzybowski, B. A. Computational Design of Syntheses Leading to Compound Libraries or Isotopically Labelled Targets. Chem. Sci. 2019, 10 (40), 9219–9232.
[3] Coley, C. W.; Thomas, D. A.; Lummiss, J. A. M.; Jaworski, J. N.; Breen, C. P.; Schultz, V.; Hart, T.; Fishman, J. S.; Rogers, L.; Gao, H.; Hicklin, R. W.; Plehiers, P. P.; Byington, J.; Piotti, J. S.; Green, W. H.; Hart, A. J.; Jamison, T. F.; Jensen, K. F. A Robotic Platform for Flow Synthesis of Organic Compounds Informed by AI Planning. Science 2019, 365 (6453), eaax1566.