(712g) Removing the Human from Human-in-the-Loop Bayesian Optimization Via Multimodal Large Language Models, or There and Back Again
AIChE Annual Meeting
2024
2024 AIChE Annual Meeting
Computing and Systems Technology Division
10C: Data-driven Optimization
Thursday, October 31, 2024 - 5:36pm to 5:57pm
Bayesian optimization has been successfully applied to the fully automated optimization of expensive, derivative-free, black-box functions within chemical engineering [1,2]. Because engineering systems embed large amounts of existing domain knowledge, approaches have been proposed that incorporate expert collaboration or feedback [3,4]. The majority of these approaches rely on an expert specifying a unique solution at each iteration [5] or proposing a prior distribution over potentially optimal solutions [6], each with practical limitations. Recently, Savage & Chanona [3] proposed an approach in which the expert expresses a preference between algorithmically generated alternative solutions, balancing the benefits of continuous expert input with the simplicity of a discrete decision. Importantly, their results highlighted that for a large class of problems, convergence improves over standard Bayesian optimization so long as the expert selects the better solution more often than random chance.
Our proposed methodology tests the hypothesis that large language models can effectively replace human experts in making discrete choices between candidate solutions in collaborative Bayesian optimization. At each iteration, we formulate and solve a high-throughput Bayesian optimization problem across multiple potential solutions, using multi-objective optimization to maximize both the utility values and the information spread across the batch. By selecting solutions at the knee point of the resulting Pareto front, we ensure that the presented alternatives balance expected improvement with distinctiveness.
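To make this selection step concrete, the sketch below implements one plausible reading of the procedure in Python: candidates are scored on a utility objective and a spread objective, the Pareto front is extracted, and alternatives are drawn from around its knee. The UCB acquisition and the distance-to-data spread measure are our assumptions; the abstract does not fix these choices.

```python
# Minimal sketch of knee-point alternative selection (assumptions:
# UCB as "utility", distance to evaluated points as "spread").
import numpy as np

def pareto_mask(F):
    """Boolean mask of the non-dominated rows of F (all objectives maximized)."""
    keep = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        for j in range(len(F)):
            if i != j and np.all(F[j] >= F[i]) and np.any(F[j] > F[i]):
                keep[i] = False
                break
    return keep

def knee_point(front):
    """Index of the knee: the point farthest from the line joining the
    two extremes of the normalized front."""
    f = (front - front.min(axis=0)) / (np.ptp(front, axis=0) + 1e-12)
    order = np.argsort(f[:, 0])
    f = f[order]
    a, d = f[0], f[-1] - f[0]
    d = d / (np.linalg.norm(d) + 1e-12)
    perp = (f - a) - np.outer((f - a) @ d, d)  # perpendicular components
    return order[np.argmax(np.linalg.norm(perp, axis=1))]

def select_alternatives(X_cand, mu, sigma, X_train, k=3, beta=2.0):
    """Pick k candidate alternatives near the Pareto knee.
    Objective 1: UCB acquisition (utility); objective 2: distance to the
    closest evaluated point (a simple stand-in for batch spread)."""
    ucb = mu + beta * sigma
    spread = np.min(np.linalg.norm(X_cand[:, None] - X_train[None], axis=-1), axis=1)
    F = np.stack([ucb, spread], axis=1)
    front = np.flatnonzero(pareto_mask(F))
    knee = front[knee_point(F[front])]
    # return the knee plus its nearest neighbours on the front
    order = np.argsort(np.linalg.norm(F[front] - F[knee], axis=1))
    return front[order[:k]]
```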
The language model is then tasked with selecting its preferred solution from this set of alternatives, allowing it to bring its inherent knowledge and understanding into the decision-making process. To facilitate effective decision-making, we employ prompt engineering techniques that provide the language model with relevant context and examples, enabling it to make informed choices.
In addition to providing textual descriptions of the alternative solutions, we generate problem-specific visualizations of each solution, of the kind that would reasonably be available in a real-world setting. These visualizations are provided to an image model, such as CLIP (Contrastive Language-Image Pre-training) [7], which can reason over the visual features and provide additional insight into the distinguishing characteristics of each solution. By combining the textual and visual information, the language model and the image model can collaboratively make more informed decisions, combining their complementary strengths in understanding and analyzing complex solution spaces.
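As an illustration, a visualization of each candidate could be scored against short textual descriptions with CLIP through the Hugging Face transformers API, as sketched below. The image files, the descriptions, and the way the scores are fused into the prompt are hypothetical placeholders; the abstract does not specify these details.

```python
# Sketch: scoring per-candidate visualizations with CLIP.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical per-candidate plots and descriptions.
images = [Image.open(f"solution_{i}.png") for i in range(3)]
texts = [
    "a reactor geometry with tight coils and strong secondary mixing",
    "a reactor geometry with a wide coil radius and gentle curvature",
    "a reactor geometry with pronounced pitch and moderate oscillation",
]

inputs = processor(text=texts, images=images, return_tensors="pt", padding=True)
outputs = model(**inputs)
# logits_per_image[i, j]: similarity of image i to description j; these
# scores can be serialized into the selection prompt as extra context.
similarity = outputs.logits_per_image.softmax(dim=-1)
print(similarity)
```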
Our prompt engineering approach applies few-shot prompting, including a small number of representative examples within the prompt to guide the language model towards understanding both the balance between a solution's predicted mean and standard deviation and the expected response format. These examples demonstrate how to select the best solution from a set of alternatives and provide a concise justification for the choice. This approach has recently been enabled by the long context windows of modern LLMs, eliminating the need for expensive, domain-specific fine-tuning.
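A minimal sketch of what such a few-shot block might look like follows; the example wording, numbers, and JSON response format are illustrative assumptions rather than the prompts used in the study.

```python
# Sketch of the few-shot portion of the selection prompt: the examples
# demonstrate the mean/standard-deviation trade-off in-context.
FEW_SHOT = """\
Example 1:
  Solution A: predicted mean = 0.82, standard deviation = 0.05
  Solution B: predicted mean = 0.78, standard deviation = 0.21
Answer: {"choice": "B", "reason": "Means are similar but B is far more uncertain, so it is more informative."}

Example 2:
  Solution A: predicted mean = 0.91, standard deviation = 0.04
  Solution B: predicted mean = 0.55, standard deviation = 0.06
Answer: {"choice": "A", "reason": "A dominates on expected value and neither option offers much extra information."}
"""

def build_prompt(problem_context, alternatives):
    """Assemble the full prompt: task description, few-shot examples,
    then the current alternatives awaiting a preference."""
    return (
        f"{problem_context}\n\n{FEW_SHOT}\n"
        "Now select the preferred solution and answer in the same "
        f"JSON format:\n{alternatives}"
    )
```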
The extended context windows of state-of-the-art LLMs such as Anthropic's Claude Haiku allow us to provide the language model with a comprehensive overview of the optimization problem, including the objective function, variable descriptions, and relevant domain knowledge. By incorporating this context into the prompt, we enable the language model to make decisions that are well informed and aligned with the specific requirements of the problem. In addition to the problem context, we also include the previous iterations of the optimization process within the prompt. This information, presented as a JSON object, contains the inputs and corresponding objective function values for each past iteration. By providing this historical data, we allow the language model to infer patterns and relationships between the input variables and the objective function, further enhancing its decision-making capabilities.
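For instance, the historical data might be serialized along the following lines; the field names and numerical values are placeholders.

```python
# Sketch of serializing the optimization history into the prompt as a
# JSON object, as described above. Field names and values are assumed.
import json

history = [
    {"iteration": 1, "inputs": {"coil_radius": 0.012, "pitch": 0.010}, "objective": 14.2},
    {"iteration": 2, "inputs": {"coil_radius": 0.009, "pitch": 0.014}, "objective": 17.8},
]

history_block = json.dumps({"previous_iterations": history}, indent=2)
# Prepended to the selection prompt so the model can infer input/output
# patterns before expressing a preference between the new alternatives.
```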
The chosen solution is subsequently evaluated and added to the dataset, informing the next iteration of the optimization loop. By involving the language model in this discrete selection step, the methodology enables continuous integration of knowledge whilst remaining fully automated, making it practical for real-world applications. Figure 1 illustrates our methodology.
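The sketch below gives a runnable, self-contained version of this loop on a toy one-dimensional problem, with the LLM call stubbed out by a deterministic choice. It is intended only to make the control flow concrete, not to reproduce the actual methodology.

```python
# Toy end-to-end loop: fit surrogate, propose alternatives, pick one,
# evaluate, repeat. The objective f and the `choose` stub are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def f(x):  # toy black-box objective (assumption)
    return -np.sin(3 * x) - x**2 + 0.7 * x

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))
y = f(X).ravel()

def choose(mu_alts):
    # Stand-in for the LLM preference: a real system would render the
    # alternatives into a prompt and parse the model's JSON answer.
    return int(np.argmax(mu_alts))

for _ in range(15):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = rng.uniform(-2, 2, size=(256, 1))
    mu, sigma = gp.predict(cand, return_std=True)
    alts = np.argsort(mu + 2.0 * sigma)[-3:]   # three top-UCB alternatives
    x_next = cand[alts[choose(mu[alts])]]      # discrete selection step
    X = np.vstack([X, [x_next]])               # evaluate and augment data
    y = np.append(y, f(x_next))

print("best objective found:", y.max())
```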
We apply our approach to three distinct problems in chemical engineering and materials science: perovskite stability optimization, silver nanoparticle synthesis, and coiled-tube reactor optimization. These problems share the common characteristic of being black-box optimization problems, where the underlying objective function is unknown or too complex to be expressed analytically. However, each problem has specific domain knowledge that can be exploited to guide the optimization process.
In the perovskite stability optimization problem, the goal is to minimize the instability index of perovskite thin-films under accelerated degradation conditions. The silver nanoparticle synthesis problem aims to optimize the flow rate ratios of reactants and the total flow rate in a microfluidic setup to achieve targeted absorbance spectra. The pulsed-flow helical tube reactor design problem seeks to optimize design parameters such as coil radius, coil pitch, inversion, oscillation amplitude, and frequency to maximize the equivalent number of tanks-in-series, which represents a closer approximation to plug-flow behavior and better mixing. To formulate these black-box optimization problems, we construct Gaussian process (GP) models using datasets containing input variables and corresponding objective function values, treating the resulting mean function as the true underlying function.
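Concretely, this benchmarking construction can be reproduced along the following lines with scikit-learn; the dataset file name and column layout are placeholders.

```python
# Sketch: fit a GP to an experimental dataset and treat its posterior
# mean as the "true" black-box objective for in-silico benchmark runs.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical dataset: input columns followed by the objective value.
data = np.loadtxt("reactor_dataset.csv", delimiter=",", skiprows=1)
X_data, y_data = data[:, :-1], data[:, -1]

surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                     normalize_y=True).fit(X_data, y_data)

def true_function(x):
    """Posterior mean, used as the ground-truth objective in benchmarks."""
    return surrogate.predict(np.atleast_2d(x))[0]
```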
Across these case studies we demonstrate the potential for language model-guided Bayesian optimization to match or even surpass the convergence rate of standard Bayesian optimization. Across the perovskite stability optimization, silver nanoparticle synthesis, and pulsed-flow helical tube reactor design problems, our approach converges to optimal solutions in the same or fewer iterations. By effectively replacing human experts in the loop, our methodology offers a fully automated, interpretable, and knowledge-driven approach to optimization in chemical engineering and materials science, opening new possibilities for accelerating discovery and optimization in a wide range of engineering and scientific fields. Figure 2 shows results from the optimization of pulsed-flow helical tube reactor design, with justifications generated by the LLM alongside solution selection, providing a level of interpretability beyond standard in-silico Bayesian optimization without human input.
In addition to benchmarking our approach across chemical engineering domain-specific problems, we conduct a series of experiments to compare the performance of different prompts and evaluate the potential of our approach as a benchmark for the use of LLMs within chemical engineering and engineering systems.
Firstly, we investigate the impact of prompt design on the optimization process by comparing the convergence rates and solution quality achieved using different prompting strategies. We explore variations in the number and selection of representative examples, the inclusion of domain-specific knowledge, and the structure of the prompt itself. Our results show that carefully crafted prompts, which strike a balance between providing sufficient context and allowing flexibility in the language model's decision-making, lead to improved optimization performance. These findings highlight the importance of prompt engineering in effectively leveraging the capabilities of LLMs for optimization tasks.
Secondly, we propose our language model-guided Bayesian optimization approach as a potential benchmark for evaluating the performance of different LLMs within the context of chemical engineering and engineering systems. By applying our methodology to a diverse set of optimization problems and comparing the results obtained using various state-of-the-art LLMs, such as OpenAI's GPT-4, Anthropic's Claude, and Google's Gemini, we provide insights into the strengths and limitations of each model in handling complex, domain-specific optimization tasks. This benchmarking effort aims to establish a standardized framework for assessing the suitability and effectiveness of LLMs in the optimization of engineering systems, enabling researchers and practitioners to make informed decisions when selecting and deploying these models in their specific applications.
The inclusion of these experiments and benchmarking efforts not only addresses some of the limitations of the current work but also opens new avenues for research in the integration of LLMs into the optimization of complex engineering systems. By providing a comprehensive evaluation of prompt design strategies and establishing a benchmark for LLM performance, this work contributes to the development of more effective and efficient optimization methodologies, ultimately accelerating innovation and discovery in chemical engineering.