(169co) A Based Take on Bayesian Optimization: Tuning Kernelized Bandits for Expensive Experiments with Mixed, Discrete Inputs
2024 AIChE Annual Meeting
Computational Molecular Science and Engineering Forum
Poster Session: Computational Molecular Science and Engineering Forum
Monday, October 28, 2024 - 3:30pm to 5:00pm
The development of numerous extensions has facilitated the application of BO across complex, high-dimensional, and mixed search spaces [5], [6], [7], [8]. Moreover, there has been a concerted effort towards the effective implementation of BO, particularly for automating closed-loop High Throughput Experimentation processes [9], [10], [11], [12]. However, the adaptation of kernelized bandits to specifically guide laboratory experimentation and manage resource-intensive simulations has received relatively scant attention. Traditional BO implementations often seek to balance the number of iterations against computational time. Yet, in the realm of laboratory experiments and expensive simulations, prioritizing a reduction in iteration count, even at the cost of increased computational time, can offer substantial benefits.
Additionally, while there has been progress in optimizing mixed search spaces that incorporate categorical, ordinal, or binary inputs [5], [6], [7], [8], the treatment of discrete numerical inputs has received comparatively less attention. The growing accessibility and utilization of chemical descriptors, encodings, or embeddings [13], [14], coupled with the constraints imposed by some laboratory setups on searching over a fully continuous parameter space, necessitate a refined approach for this category of inputs.
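To make the notion of a discrete numerical input concrete, the following minimal sketch maps a point proposed in a fully continuous space back onto settings that can actually be run. Everything in it is an assumption for illustration: the reagent names, descriptor values, and allowed temperatures are placeholders, and the nearest-neighbour projection is only one simple way to handle such inputs, not the method of this contribution.

```python
import math

# Hypothetical descriptor table: each candidate reagent is represented by a
# fixed numerical vector. Names and values are placeholders, not real data.
DESCRIPTORS = {
    "ligand_A": (0.10, 1.50),
    "ligand_B": (0.35, 0.90),
    "ligand_C": (0.80, 1.20),
}

# Hypothetical equipment constraint: only these temperatures can be set.
ALLOWED_TEMPERATURES = (25.0, 40.0, 60.0, 80.0)


def nearest_value(x, allowed):
    """Return the allowed scalar closest to x."""
    return min(allowed, key=lambda v: abs(v - x))


def nearest_reagent(point, table):
    """Return the name whose descriptor vector is closest (Euclidean) to point."""
    return min(table, key=lambda name: math.dist(point, table[name]))


# A point proposed in the fully continuous space ...
proposal = {"descriptor": (0.40, 1.00), "temperature": 47.0}

# ... projected onto an experiment the laboratory can actually run.
feasible = {
    "reagent": nearest_reagent(proposal["descriptor"], DESCRIPTORS),
    "temperature": nearest_value(proposal["temperature"], ALLOWED_TEMPERATURES),
}
print(feasible)
```

Unlike purely categorical inputs, the values here carry a meaningful distance, which is what makes such a projection possible in the first place.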
This conference contribution explores several methods for optimizing the acquisition function in search spaces that include discrete numerical inputs, namely probabilistic reparameterization, interleaved optimization, and continuous relaxation. This step is considered crucial for adapting traditional BO approaches to non-continuous inputs [15], [16]. Additionally, we are testing a novel type-dependent embedding strategy that combines random matrix embeddings for continuous inputs [6], [17], [18] with a traditional, bi-directional dimensionality-reduction method, specifically Principal Component Analysis, for discrete properties. This approach facilitates application in high-dimensional search spaces, which is particularly pertinent when a large array of descriptors is selected to represent each class of reaction agents.
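The difference between two of these strategies can be illustrated on a single discrete numerical input. The sketch below is a simplified toy, not the implementation from [16] and not the algorithm developed in this work: the `acquisition` function is a hand-made stand-in rather than one derived from a surrogate model, the allowed values are hypothetical, and the probabilistic reparameterization is reduced to a softmax distribution over the allowed values whose expected acquisition is maximized by plain gradient ascent.

```python
import math

# Allowed settings of one discrete numerical input (hypothetical values).
ALLOWED = (0.0, 1.0, 2.0, 3.0)


def acquisition(x):
    """Toy stand-in for an acquisition function; not derived from a real model."""
    narrow_peak = math.exp(-((x - 1.45) / 0.2) ** 2)
    broad_peak = 0.5 * math.exp(-((x - 3.0) / 0.5) ** 2)
    return narrow_peak + broad_peak


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    weights = [math.exp(t - shift) for t in logits]
    total = sum(weights)
    return [w / total for w in weights]


def continuous_relaxation():
    """Maximize over the relaxed interval on a dense grid, then round."""
    grid = [3.0 * i / 1000 for i in range(1001)]
    x_star = max(grid, key=acquisition)
    return min(ALLOWED, key=lambda v: abs(v - x_star))


def probabilistic_reparameterization(steps=200, lr=1.0):
    """Gradient ascent on E[acquisition(z)] with z ~ softmax(theta)."""
    values = [acquisition(v) for v in ALLOWED]
    theta = [0.0] * len(ALLOWED)  # start from a uniform distribution
    for _ in range(steps):
        probs = softmax(theta)
        expected = sum(p * a for p, a in zip(probs, values))
        # Exact gradient of the expectation with respect to each logit.
        grad = [p * (a - expected) for p, a in zip(probs, values)]
        theta = [t + lr * g for t, g in zip(theta, grad)]
    probs = softmax(theta)
    best = max(range(len(ALLOWED)), key=lambda k: probs[k])
    return ALLOWED[best]


relaxed_choice = continuous_relaxation()
pr_choice = probabilistic_reparameterization()
print("relax + round:", relaxed_choice, acquisition(relaxed_choice))
print("prob. reparam.:", pr_choice, acquisition(pr_choice))
```

In this toy, the continuous maximizer lies on a narrow peak between two allowed values, so rounding lands on a poor setting (1.0), whereas the reparameterized objective only ever evaluates allowed values and concentrates on the best one (3.0). Real acquisition landscapes need not behave this way, and the expectation is usually estimated by sampling rather than summed exactly.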
Although BO is non-parametric, the workflow of chemical engineers and chemists involves numerous choices, such as the original encoding or embedding of chemical species and the initialization strategy, which significantly affect the efficacy of the optimization [3]. Our research also examines the influence of these factors.
We have developed an algorithm informed by these findings and benchmarked it against prevailing BO strategies in a series of experiments in chemical engineering and synthesis. Its open-source codebase is built on the PyTorch [19] ecosystem, enabling seamless integration with established frameworks such as BoTorch [20] and GPyTorch [21].
This contribution aims not only to advance the theoretical understanding and practical application of BO in guiding laboratory experimentation and simulations but also to foster the development of more efficient and effective optimization strategies in the chemical sciences.
References
[1] J. A. G. Torres et al., "A Multi-Objective Active Learning Platform and Web App for Reaction Optimization", J. Am. Chem. Soc., vol. 144, no. 43, pp. 19999–20007, Nov. 2022, doi: 10.1021/jacs.2c08592.
[2] K. Wang and A. W. Dowling, "Bayesian optimization for chemical products and functional materials", Current Opinion in Chemical Engineering, vol. 36, p. 100728, Jun. 2022, doi: 10.1016/j.coche.2021.100728.
[3] B. J. Shields et al., "Bayesian reaction optimization as a tool for chemical synthesis", Nature, vol. 590, no. 7844, pp. 89–96, Feb. 2021, doi: 10.1038/s41586-021-03213-y.
[4] F. Häse, L. M. Roch, and A. Aspuru-Guzik, "Chimera: enabling hierarchy based multi-objective optimization for self-driving laboratories", Chem. Sci., vol. 9, no. 39, pp. 7642–7655, 2018, doi: 10.1039/C8SC02239A.
[5] X. Wan, V. Nguyen, H. Ha, B. Ru, C. Lu, and M. A. Osborne, "Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces". arXiv, Jun. 10, 2021. Accessed: Nov. 16, 2023. [Online]. Available: http://arxiv.org/abs/2102.07188
[6] L. Papenmeier, L. Nardi, and M. Poloczek, "Bounce: a Reliable Bayesian Optimization Algorithm for Combinatorial and Mixed Spaces". arXiv, Jul. 02, 2023. Accessed: Aug. 25, 2023. [Online]. Available: http://arxiv.org/abs/2307.00618
[7] C. Oh, J. M. Tomczak, E. Gavves, and M. Welling, "Combinatorial Bayesian Optimization using the Graph Cartesian Product". arXiv, Oct. 28, 2019. Accessed: Apr. 07, 2024. [Online]. Available: http://arxiv.org/abs/1902.00448
[8] A. Deshwal, S. Ament, M. Balandat, E. Bakshy, J. R. Doppa, and D. Eriksson, "Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-based Embeddings", Mar. 2023.
[9] O. J. Kershaw et al., "Machine learning directed multi-objective optimization of mixed variable chemical systems", Chemical Engineering Journal, vol. 451, p. 138443, Jan. 2023, doi: 10.1016/j.cej.2022.138443.
[10] A. M. K. Nambiar, C. P. Breen, T. Hart, T. Kulesza, T. F. Jamison, and K. F. Jensen, "Bayesian Optimization of Computer-Proposed Multistep Synthetic Routes on an Automated Robotic Flow Platform", ACS Cent. Sci., vol. 8, no. 6, pp. 825–836, Jun. 2022, doi: 10.1021/acscentsci.2c00207.
[11] M. Christensen et al., "Data-science driven autonomous process optimization", Commun Chem, vol. 4, no. 1, p. 112, Dec. 2021, doi: 10.1038/s42004-021-00550-x.
[12] A. E. Gongora et al., "A Bayesian experimental autonomous researcher for mechanical design", Sci. Adv., vol. 6, no. 15, p. eaaz1708, Apr. 2020, doi: 10.1126/sciadv.aaz1708.
[13] T. Gensch et al., "A Comprehensive Discovery Platform for Organophosphorus Ligands for Catalysis", J. Am. Chem. Soc., vol. 144, no. 3, pp. 1205–1217, Jan. 2022, doi: 10.1021/jacs.1c09718.
[14] J. Ross, B. Belgodere, V. Chenthamarakshan, I. Padhi, Y. Mroueh, and P. Das, "Large-Scale Chemical Language Representations Capture Molecular Structure and Properties". arXiv, Dec. 14, 2022. Accessed: Apr. 07, 2024. [Online]. Available: http://arxiv.org/abs/2106.09553
[15] E. C. Garrido-Merchán and D. Hernández-Lobato, "Dealing with Categorical and Integer-valued Variables in Bayesian Optimization with Gaussian Processes", Neurocomputing, vol. 380, pp. 20–35, Mar. 2020, doi: 10.1016/j.neucom.2019.11.004.
[16] S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, and E. Bakshy, "Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization". arXiv, Oct. 18, 2022. Accessed: Oct. 18, 2023. [Online]. Available: http://arxiv.org/abs/2210.10199
[17] J. Kirschner, M. Mutný, N. Hiller, R. Ischebeck, and A. Krause, "Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces". arXiv, May 28, 2019. Accessed: Apr. 08, 2023. [Online]. Available: http://arxiv.org/abs/1902.03229
[18] A. Munteanu, A. Nayebi, and M. Poloczek, "A Framework for Bayesian Optimization in Embedded Subspaces", 2019.
[19] P. Wu, "PyTorch 2.0: The Journey to Bringing Compiler Technologies to the Core of PyTorch (Keynote)", in Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization, Montréal QC Canada: ACM, Feb. 2023, pp. 1–1. doi: 10.1145/3579990.3583093.
[20] M. Balandat et al., "BOTORCH: A Framework for Efficient Monte-Carlo Bayesian Optimization", Advances in Neural Information Processing Systems, vol. 33, Dec. 2020.
[21] J. Gardner, G. Pleiss, K. Q. Weinberger, D. Bindel, and A. G. Wilson, "GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration".