(169co) A Based Take on Bayesian Optimization: Tuning Kernelized Bandits for Expensive Experiments with Mixed, Discrete Inputs

Conference

AIChE Annual Meeting

Year

2024

Proceeding

2024 AIChE Annual Meeting

Group

Computational Molecular Science and Engineering Forum

Session

Poster Session: Computational Molecular Science and Engineering Forum

Time

Monday, October 28, 2024 - 3:30pm to 5:00pm

Authors

Griehl, C. - Presenter

Sundmacher, K., Max Planck Institute for Dynamics of Complex Technical Systems

In recent years, Bayesian Optimization (BO), also known as kernelized bandits, has emerged as a powerful tool that has moved beyond its origins in computer science to find successful applications in fields such as drug discovery, materials design, and chemical synthesis [1], [2], [3], [4]. This widespread adoption is largely due to the method's non-parametric nature and its sample efficiency, that is, the small number of evaluations it requires.

The development of numerous extensions has facilitated the application of BO across complex, high-dimensional, and mixed search spaces [5], [6], [7], [8]. Moreover, there has been a concerted effort towards the effective implementation of BO, particularly for automating closed-loop High Throughput Experimentation processes [9], [10], [11], [12]. However, the adaptation of kernelized bandits specifically to guide laboratory experimentation and manage resource-intensive simulations has received relatively little attention. Traditional BO implementations often seek to balance the number of iterations against computational time. Yet for laboratory experiments and expensive simulations, prioritizing a reduction in iteration count, even at the cost of increased computational time, can offer substantial benefits.

Additionally, while there has been progress in optimizing mixed search spaces that incorporate categorical, ordinal, or binary inputs [5], [6], [7], [8], the treatment of discrete numerical inputs has received comparatively less attention. The growing accessibility and utilization of chemical descriptors, encodings, or embeddings [13], [14], coupled with the constraints imposed by some laboratory setups on searching over a fully continuous parameter space, necessitate a refined approach for this category of inputs.
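To make the notion of a discrete numerical input concrete, the following minimal sketch shows a naive baseline: the acquisition function is optimized over a continuous relaxation of the input, and the result is then rounded to the nearest allowed value. It is an illustration under stated assumptions, not the method of this contribution. The allowed values, the observations, the RBF kernel with unit lengthscale, and all function names are hypothetical, and a small NumPy Gaussian process stands in for a full BO library.

```python
import math
import numpy as np


def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Zero-mean GP posterior mean and standard deviation at x_query."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected improvement for a maximization problem."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def snap(x, allowed):
    """Round a continuous value to the nearest allowed discrete value."""
    return allowed[np.argmin(np.abs(allowed - x))]


def relax_and_round(x_train, y_train, allowed, n_grid=501):
    """Maximize EI on a dense continuous grid, then snap to the allowed set."""
    grid = np.linspace(allowed.min(), allowed.max(), n_grid)
    mean, std = gp_posterior(x_train, y_train, grid)
    ei = expected_improvement(mean, std, y_train.max())
    return snap(grid[np.argmax(ei)], allowed)


def enumerate_discrete(x_train, y_train, allowed):
    """Reference: maximize EI directly over the allowed values only."""
    mean, std = gp_posterior(x_train, y_train, allowed)
    ei = expected_improvement(mean, std, y_train.max())
    return allowed[np.argmax(ei)]


# Hypothetical setup: six allowed settings, three of them already measured.
allowed = np.array([0.0, 0.5, 1.0, 2.0, 3.5, 5.0])
x_train = allowed[[0, 3, 5]]
y_train = np.array([0.2, 0.9, 0.4])

print("relax and round:", relax_and_round(x_train, y_train, allowed))
print("enumeration:    ", enumerate_discrete(x_train, y_train, allowed))
```

The two strategies need not agree: the relaxed optimum may snap onto a value that has already been evaluated, or onto one whose acquisition value is lower than that of another allowed point. Direct enumeration avoids this in one dimension but does not scale to many discrete inputs.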

This contribution explores several methodologies, namely probabilistic reparameterization, interleaved optimization, and continuous relaxation, for optimizing the acquisition function in search spaces that include discrete numerical inputs. This step is considered crucial for adapting traditional BO approaches to non-continuous inputs [15], [16]. Additionally, we test a novel type-dependent embedding strategy that combines random matrix embeddings for continuous inputs [6], [17], [18] with conventional, bi-directional dimensionality-reduction methods, specifically Principal Component Analysis, for discrete properties. This approach facilitates application in high-dimensional search spaces, which is particularly pertinent when a large array of descriptors is selected to represent each class of reaction agents.
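The idea of a type-dependent embedding can be sketched as follows. This is one possible reading under stated assumptions, not the authors' implementation: the continuous inputs are lifted from a low-dimensional space through a fixed Gaussian random matrix with clipping to a box, while the descriptor table of the discrete candidates is reduced with PCA (computed here via an SVD). All sizes, the random stand-in descriptor table, the nearest-neighbour decoding, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes.
D_CONT, d_cont = 20, 3        # full and embedded continuous dimension
N_SPECIES, D_DESC = 12, 30    # candidate species and descriptors per species
d_desc = 2                    # embedded descriptor dimension

# Continuous part: fixed random matrix mapping the low-dimensional space up.
A = rng.standard_normal((D_CONT, d_cont))


def lift_continuous(z_cont):
    """Map a low-dimensional point into the box [-1, 1]^D_CONT."""
    return np.clip(A @ z_cont, -1.0, 1.0)


# Discrete part: PCA of a descriptor table (random stand-in data here).
descriptors = rng.standard_normal((N_SPECIES, D_DESC))
desc_mean = descriptors.mean(axis=0)
_, _, Vt = np.linalg.svd(descriptors - desc_mean, full_matrices=False)
components = Vt[:d_desc]      # rows are orthonormal principal axes


def pca_project(X):
    """Forward direction: descriptors to principal-component scores."""
    return (X - desc_mean) @ components.T


def pca_reconstruct(Z):
    """Backward direction: scores to approximate descriptors."""
    return Z @ components + desc_mean


species_scores = pca_project(descriptors)


def nearest_species(z_desc):
    """Decode a point in PCA space to the index of the closest real species."""
    return int(np.argmin(np.linalg.norm(species_scores - z_desc, axis=1)))


def decode(z):
    """Split an embedded point by input type and decode each part."""
    return lift_continuous(z[:d_cont]), nearest_species(z[d_cont:])


# A candidate proposed in the joint low-dimensional space.
z = rng.uniform(-1.0, 1.0, size=d_cont + d_desc)
x_cont, species_idx = decode(z)
print(x_cont.shape, species_idx)
```

In this sketch the optimizer would search over only d_cont + d_desc dimensions, and every proposal decodes to a feasible experiment: a point in the continuous box and one species that actually exists in the descriptor table.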

Although BO is non-parametric, the workflow of chemical engineers and chemists involves numerous choices, such as the original encoding or embedding of chemical species and the initialization strategy, that significantly affect how well the optimization performs [3]. Our research also examines the influence of these factors.

Informed by our findings, we have developed an algorithm and benchmarked it against prevailing BO strategies in a series of experiments in chemical engineering and synthesis. The algorithm's open-source codebase is built on the PyTorch [19] ecosystem, enabling seamless integration with established frameworks such as BoTorch [20] and GPyTorch [21].

This contribution aims not only to advance the theoretical understanding and practical application of BO in guiding laboratory experimentation and simulations but also to foster the development of more efficient and effective optimization strategies in the chemical sciences.

References

[1] J. A. G. Torres et al., ‘A Multi-Objective Active Learning Platform and Web App for Reaction Optimization’, J. Am. Chem. Soc., vol. 144, no. 43, pp. 19999–20007, Nov. 2022, doi: 10.1021/jacs.2c08592.

[2] K. Wang and A. W. Dowling, ‘Bayesian optimization for chemical products and functional materials’, Current Opinion in Chemical Engineering, vol. 36, p. 100728, Jun. 2022, doi: 10.1016/j.coche.2021.100728.

[3] B. J. Shields et al., ‘Bayesian reaction optimization as a tool for chemical synthesis’, Nature, vol. 590, no. 7844, pp. 89–96, Feb. 2021, doi: 10.1038/s41586-021-03213-y.

[4] F. Häse, L. M. Roch, and A. Aspuru-Guzik, ‘Chimera: enabling hierarchy based multi-objective optimization for self-driving laboratories’, Chem. Sci., vol. 9, no. 39, pp. 7642–7655, 2018, doi: 10.1039/C8SC02239A.

[5] X. Wan, V. Nguyen, H. Ha, B. Ru, C. Lu, and M. A. Osborne, ‘Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces’. arXiv, Jun. 10, 2021. Accessed: Nov. 16, 2023. [Online]. Available: http://arxiv.org/abs/2102.07188

[6] L. Papenmeier, L. Nardi, and M. Poloczek, ‘Bounce: a Reliable Bayesian Optimization Algorithm for Combinatorial and Mixed Spaces’. arXiv, Jul. 02, 2023. Accessed: Aug. 25, 2023. [Online]. Available: http://arxiv.org/abs/2307.00618

[7] C. Oh, J. M. Tomczak, E. Gavves, and M. Welling, ‘Combinatorial Bayesian Optimization using the Graph Cartesian Product’. arXiv, Oct. 28, 2019. Accessed: Apr. 07, 2024. [Online]. Available: http://arxiv.org/abs/1902.00448

[8] A. Deshwal, S. Ament, M. Balandat, E. Bakshy, J. R. Doppa, and D. Eriksson, ‘Bayesian Optimization over High-Dimensional Combinatorial Spaces via Dictionary-based Embeddings’, Mar. 2023.

[9] O. J. Kershaw et al., ‘Machine learning directed multi-objective optimization of mixed variable chemical systems’, Chemical Engineering Journal, vol. 451, p. 138443, Jan. 2023, doi: 10.1016/j.cej.2022.138443.

[10] A. M. K. Nambiar, C. P. Breen, T. Hart, T. Kulesza, T. F. Jamison, and K. F. Jensen, ‘Bayesian Optimization of Computer-Proposed Multistep Synthetic Routes on an Automated Robotic Flow Platform’, ACS Cent. Sci., vol. 8, no. 6, pp. 825–836, Jun. 2022, doi: 10.1021/acscentsci.2c00207.

[11] M. Christensen et al., ‘Data-science driven autonomous process optimization’, Commun Chem, vol. 4, no. 1, p. 112, Dec. 2021, doi: 10.1038/s42004-021-00550-x.

[12] A. E. Gongora et al., ‘A Bayesian experimental autonomous researcher for mechanical design’, Sci. Adv., vol. 6, no. 15, p. eaaz1708, Apr. 2020, doi: 10.1126/sciadv.aaz1708.

[13] T. Gensch et al., ‘A Comprehensive Discovery Platform for Organophosphorus Ligands for Catalysis’, J. Am. Chem. Soc., vol. 144, no. 3, pp. 1205–1217, Jan. 2022, doi: 10.1021/jacs.1c09718.

[14] J. Ross, B. Belgodere, V. Chenthamarakshan, I. Padhi, Y. Mroueh, and P. Das, ‘Large-Scale Chemical Language Representations Capture Molecular Structure and Properties’. arXiv, Dec. 14, 2022. Accessed: Apr. 07, 2024. [Online]. Available: http://arxiv.org/abs/2106.09553

[15] E. C. Garrido-Merchán and D. Hernández-Lobato, ‘Dealing with Categorical and Integer-valued Variables in Bayesian Optimization with Gaussian Processes’, Neurocomputing, vol. 380, pp. 20–35, Mar. 2020, doi: 10.1016/j.neucom.2019.11.004.

[16] S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, and E. Bakshy, ‘Bayesian Optimization over Discrete and Mixed Spaces via Probabilistic Reparameterization’. arXiv, Oct. 18, 2022. Accessed: Oct. 18, 2023. [Online]. Available: http://arxiv.org/abs/2210.10199

[17] J. Kirschner, M. Mutný, N. Hiller, R. Ischebeck, and A. Krause, ‘Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces’. arXiv, May 28, 2019. Accessed: Apr. 08, 2023. [Online]. Available: http://arxiv.org/abs/1902.03229

[18] A. Munteanu, A. Nayebi, and M. Poloczek, ‘A Framework for Bayesian Optimization in Embedded Subspaces’, 2019.

[19] P. Wu, ‘PyTorch 2.0: The Journey to Bringing Compiler Technologies to the Core of PyTorch (Keynote)’, in Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization, Montréal QC Canada: ACM, Feb. 2023, pp. 1–1. doi: 10.1145/3579990.3583093.

[20] M. Balandat et al., ‘BOTORCH: A Framework for Efficient Monte-Carlo Bayesian Optimization’, Advances in Neural Information Processing Systems, vol. 33, Dec. 2020.

[21] J. Gardner, G. Pleiss, K. Q. Weinberger, D. Bindel, and A. G. Wilson, ‘GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration’.

Topics

Computational Molecular Engineering

Catalysis

Computing and Systems Engineering
