(565b) Molecular Discovery with Limited Human Input Using a Machine Learning Guided Automated Platform | AIChE

Authors 

McDonald, M. - Presenter, Georgia Tech
Koscher, B., Massachusetts Institute of Technology
Ha, S. K., Massachusetts Institute of Technology
Bilodeau, C., Massachusetts Institute of Technology
Jensen, K., Massachusetts Institute of Technology
Discovery of new small molecules often involves synthesizing and testing a rationally designed family of molecules built around a known biologically active motif. Recently, machine learning methods have led to drug discoveries by repurposing known compounds [1] and by generating new molecules with targeted protein interactions [2]. Candidate small molecules should be moderately lipophilic and relatively non-toxic, as well as biologically potent. We have developed a well-plate-based platform for the automated optimization of molecular scaffolds across multiple properties. Machine learning property models are coupled with a predictive retrosynthetic model (ASKCOS) and robotic modules to suggest, synthesize, and test candidate molecules. The discovery process is iterative: data from initial synthesis campaigns are used to retrain the property models and improve their predictions for a subsequent high-performer iteration. The properties used to demonstrate the platform are water/octanol partitioning [3], cytotoxicity measured with HEK293T cells, and optical spectral properties [4], which serve as a stand-in for plate-reader-based biological activity assays.
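One way to picture optimization across several properties at once is a Pareto filter over model predictions. The sketch below is illustrative only and is not the platform's actual selection logic: the `Candidate` fields, the target log P of 2.5 (a placeholder for "moderately lipophilic"), and all numeric values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for model-predicted properties of one molecule."""
    name: str
    log_p: float         # predicted water/octanol partition coefficient (log P)
    cytotoxicity: float  # predicted toxicity score, lower is better
    activity: float      # predicted activity proxy, higher is better


def objectives(c: Candidate, target_log_p: float = 2.5) -> Tuple[float, float, float]:
    """Convert every property to a 'lower is better' objective.

    target_log_p is an assumed value, not one taken from the abstract.
    """
    return (abs(c.log_p - target_log_p), c.cytotoxicity, -c.activity)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(cands: List[Candidate], target_log_p: float = 2.5) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    scored = [(c, objectives(c, target_log_p)) for c in cands]
    return [c for c, s in scored if not any(dominates(t, s) for _, t in scored)]


# Made-up predictions for four hypothetical molecules.
pool = [
    Candidate("A", log_p=2.5, cytotoxicity=0.2, activity=0.9),
    Candidate("B", log_p=4.5, cytotoxicity=0.5, activity=0.5),
    Candidate("C", log_p=3.0, cytotoxicity=0.1, activity=0.6),
    Candidate("D", log_p=1.0, cytotoxicity=0.3, activity=0.95),
]
front_names = [c.name for c in pareto_front(pool)]
print(front_names)
```

Here candidate B is discarded because A is at least as good on all three objectives and strictly better on each; A, C, and D each win on at least one property, so all three would remain as synthesis candidates.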

This talk will focus on efforts to improve automation so that the entire discovery cycle (initial explorative target selection, synthesis, characterization, model refinement and retraining, and best-performer synthesis) can be executed with limited human interaction. Liquid handling, reaction optimization, air-free synthesis, high-temperature synthesis, workup, and isolation have all been automated in a well-plate format and interfaced with the ML models. This combined approach to small molecule exploration, in which several properties are optimized automatically and simultaneously, could allow medicinal chemists to focus on scaffold/family discovery rather than on more tedious scoping experiments. If evaluation of whole families of compounds were entirely automated, more effort could be applied to identifying targets for small molecule drugs.
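The five stages of the discovery cycle can be sketched as a single closed loop. This is a toy illustration under loud assumptions: each molecule is reduced to one hypothetical numeric descriptor, synthesis and characterization are replaced by a stub `measure` function, and an ordinary least-squares line stands in for the ML property models. None of these names or choices come from the platform itself.

```python
from typing import Callable, List, Tuple


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least squares for y = a*x + b (stand-in for model retraining).

    Assumes at least two points with distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def run_cycle(
    pool: List[float],
    measure: Callable[[float], float],
    n_explore: int = 3,
    n_best: int = 2,
) -> Tuple[List[float], List[float]]:
    """One pass through explore -> make/measure -> retrain -> pick best."""
    # 1. Explorative target selection: spread picks evenly over the pool.
    step = max(1, len(pool) // n_explore)
    explore = pool[::step][:n_explore]
    # 2-3. Synthesis and characterization, stubbed out by `measure`.
    data = [(x, measure(x)) for x in explore]
    # 4. Model retraining on the newly measured data.
    slope, intercept = fit_line(data)
    # 5. Best-performer selection among molecules not yet made.
    remaining = [x for x in pool if x not in explore]
    ranked = sorted(remaining, key=lambda x: slope * x + intercept, reverse=True)
    return explore, ranked[:n_best]


# Hypothetical descriptor values and a made-up linear "ground truth" assay.
descriptors = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
explored, best = run_cycle(descriptors, measure=lambda x: 2.0 * x + 1.0)
print(explored, best)
```

In a real system each numbered step would be a call to a robotic module or a trained model; the point of the sketch is only that no step requires a human decision once the pool and the assay are defined.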

[1] J. M. Stokes, et al., Cell, 2020, 180, 688-702.e13.

[2] Y. Bian and X.-Q. Xie, Journal of Molecular Modeling, 2021, 27, 71.

[3] F. H. Vermeire and W. H. Green, Chemical Engineering Journal, 2021, 418, 129307.

[4] K. P. Greenman, W. H. Green and R. Gomez-Bombarelli, Chemical Science, 2022.