
(712d) Multistep Lookahead Bayesian Optimization for High Dimensional Black-Box Optimization Problems Using Reinforcement Learning

Authors 

Koh, D. Y., Georgia Institute of Technology
Lee, J. H., University of Southern California
Tsay, C., Imperial College London
Multistep lookahead Bayesian optimization for high dimensional black-box optimization problems using reinforcement learning

Mujin Cheon^a, Dong-yeun Koh^a,*, Jay H. Lee^b,*, and Calvin Tsay^c,*

a Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-Ro, Yuseong-Gu, Daejeon, 34141, Republic of Korea

b Department of Chemical and Biological Engineering, University of Southern California, Los Angeles, CA 90007, USA

c Department of Computing, Imperial College London, London, SW7 2AZ, England, United Kingdom

Abstract:

Bayesian optimization (BO) is a popular decision-making tool for global optimization problems in which the underlying system is not fully known and must be treated as a “black box”. It excels in sequential decision-making by balancing exploration (learning more about the system) and exploitation (making the best decision based on current knowledge), which enables it to find global optima efficiently with minimal data. This attribute is particularly valuable in fields such as chemical engineering, where collecting experimental data can be both expensive and time-consuming [1]. Accordingly, BO has been explored as a design-of-experiments strategy in areas such as materials discovery, reaction engineering, and process optimization [2-4]. Despite its effectiveness, standard BO methods optimize only for the immediate next-step improvement and thus do not directly account for the dependence between the current data acquisition and subsequent experiments. This limitation can be critical when decision-making extends beyond a single step, e.g., in an experimental campaign [5]. In theory, achieving an optimal balance between exploration and exploitation would require solving the “multi-step lookahead” stochastic dynamic programming (SDP) problem underlying BO. However, this is generally infeasible due to its computational complexity and resource requirements.
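For concreteness, the H-step lookahead acquisition can be written as a nested (Bellman-type) recursion over the GP posterior; the sketch below uses standard notation of our own choosing (D_t for the data after t evaluations, u for a one-step utility such as improvement over the incumbent), not notation taken from this work:

\alpha_H(x \mid \mathcal{D}_t) \;=\; \mathbb{E}_{y \sim p(y \mid x, \mathcal{D}_t)}\!\left[\, u(x, y \mid \mathcal{D}_t) \;+\; \max_{x'} \alpha_{H-1}\!\left(x' \mid \mathcal{D}_t \cup \{(x, y)\}\right) \right], \qquad \alpha_0 \equiv 0.

Standard BO corresponds to truncating this recursion at H = 1 (e.g., expected improvement); for H > 1, each additional level adds an expectation over unobserved outcomes and an inner maximization over future queries, which is what makes the exact SDP solution intractable.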

Several efforts have been undertaken to approximately solve the SDP problem inherent in multi-step lookahead BO. These approaches generally fall into two categories: those that employ rollout techniques and those limited to a two-step lookahead. More recently, several works have proposed methods that incorporate reinforcement learning (RL) to tackle multi-step lookahead BO [3, 6]. However, these RL-based approaches have encountered scalability issues, primarily due to how they represent the state space.

In this study, we introduce a novel architecture that integrates end-to-end RL with BO for multi-step lookahead decision-making in high-dimensional, unknown environments. Specifically, we encode the current state of knowledge in BO, i.e., the set of experiments performed so far, as a point in a latent space using a proposed neural network architecture. A critical characteristic of our model is its permutation invariance: the order in which data are acquired in chemical experiments does not affect our understanding of the system. The learned latent representation is then used by an RL agent to make multi-step lookahead decisions. Actions determined by the RL agent are executed within a virtual environment based on a Gaussian process (GP) model, and the rewards obtained from these virtual experiments are used to iteratively update both the encoder and the RL agent. We evaluate the proposed BO framework on several high-dimensional benchmark functions, comparing its performance against traditional BO, a high-dimensional BO algorithm, and another end-to-end BO algorithm. Our computational study shows that the proposed method attains lower average regret, indicating faster identification of optimal solutions across various scenarios. This suggests that the proposed framework can significantly enhance the efficiency of sequential decision-making in unknown environments, accelerating the discovery of globally optimal solutions.
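As a minimal illustration of the architectural idea described above (a sketch under our own assumptions, not the authors' implementation), the following PyTorch snippet shows a deep-sets-style encoder that mean-pools per-observation features into a single latent vector, followed by a small policy network that maps this latent state to the next query point; all module names, layer sizes, and the choice of PyTorch are illustrative assumptions.

# Illustrative sketch only: a permutation-invariant encoder plus an RL policy head.
# Names, layer sizes, and the use of PyTorch are assumptions, not the authors' code.
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Deep-sets-style encoder: phi acts on each (x, y) pair; mean pooling removes order."""
    def __init__(self, x_dim, latent_dim=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(x_dim + 1, 64), nn.ReLU(), nn.Linear(64, 64))
        self.rho = nn.Sequential(nn.Linear(64, latent_dim), nn.ReLU())

    def forward(self, X, y):
        # X: (n, x_dim) query points; y: (n, 1) observed responses
        h = self.phi(torch.cat([X, y], dim=-1))   # per-observation features, shape (n, 64)
        return self.rho(h.mean(dim=0))            # pooled latent state, shape (latent_dim,)

class Policy(nn.Module):
    """Actor network: maps the latent state to the next query in the unit hypercube."""
    def __init__(self, latent_dim, x_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, x_dim), nn.Sigmoid())

    def forward(self, z):
        return self.net(z)

# Usage: encode five prior experiments on a 10-D problem and propose the next query.
x_dim = 10
encoder, policy = SetEncoder(x_dim), Policy(32, x_dim)
X, y = torch.rand(5, x_dim), torch.rand(5, 1)
x_next = policy(encoder(X, y))   # identical for any reordering of the five rows

Because the pooled latent state is invariant to the order of the rows of (X, y), the agent's decision depends only on what has been observed, not on when it was observed; in training, such a policy would be rolled forward inside a GP surrogate environment and updated from the resulting virtual-experiment rewards, as described above.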

[1] Beg, S., Swain, S., Rahman, M., Hasnain, M. S., & Imam, S. S. (2019). Application of design of experiments (DoE) in pharmaceutical product and process optimization. In Pharmaceutical quality by design (pp. 43-64). Academic Press.

[2] Pruksawan, S., Lambard, G., Samitsu, S., Sodeyama, K., & Naito, M. (2019). Prediction and optimization of epoxy adhesive strength from a small dataset through active learning. Science and Technology of Advanced Materials, 20(1), 1010-1021.

[3] Byun, H. E., Kim, B., & Lee, J. H. (2022). Multi-step lookahead Bayesian optimization with active learning using reinforcement learning and its application to data-driven batch-to-batch optimization. Computers & Chemical Engineering, 167, 107987.

[4] Paulson, J. A., & Tsay, C. (2024). Bayesian optimization as a flexible and efficient design framework for sustainable process systems. arXiv preprint arXiv:2401.16373.

[5] Lee, E., Eriksson, D., Bindel, D., Cheng, B., & McCourt, M. (2020, August). Efficient rollout strategies for Bayesian optimization. In Conference on Uncertainty in Artificial Intelligence (pp. 260-269). PMLR.

[6] Cheon, M., Byun, H., & Lee, J. H. (2022). Reinforcement learning based multi-step look-ahead Bayesian optimization. IFAC-PapersOnLine, 55(7), 100-105.