(4qf) Large Language Model and Multimodal Learning Framework for Catalyst Discovery | AIChE

Machine learning (ML) has attracted significant attention in the modeling of catalytic systems because it can efficiently replace computationally expensive quantum chemistry calculations. Graph Neural Networks (GNNs) are especially effective at predicting properties such as energy and forces, and can also be used to generate structures. Two challenges remain, however: constructing robust graph representations and fully utilizing domain knowledge for system-specific modeling. Graph representations model materials by capturing atomic connectivity, but the accurate atomic coordinates needed to build them can be difficult to obtain. And although extensive knowledge exists about catalytic systems, integrating it effectively into models is challenging. To address these issues, we applied large language models (LLMs) and multimodal learning to adsorbate-catalyst system modeling. First, we introduced a language model that predicts adsorption energy solely from textual descriptions of adsorbate-catalyst systems. Next, by applying multimodal learning that aligns graph and language representations, we achieved a 7.4-9.8% reduction in Mean Absolute Error (MAE) for adsorption energy prediction compared to a language-only approach. Additionally, we designed an energy prediction framework that relies only on composition and surface orientation, bypassing the need for atomic coordinates. In this framework, a generative language model autoregressively creates Crystallographic Information Files (CIFs), which are then used as input for energy prediction. Finally, we developed an LLM agent that autonomously proposes system-specific candidate configurations for optimal adsorbate placement. Using GPT-4o, the agent plans and reasons through potential adsorption configurations to identify those with the lowest energy.
This approach significantly reduces the number of initial candidates while identifying lower adsorption energies. For instance, for a nitrogen reduction reaction catalyst system, the LLM agent found an adsorption energy of -1.367 eV from just four initial configurations, whereas the conventional approach required 59 configurations to find an energy of -0.954 eV. These results demonstrate the potential of LLMs and multimodal learning in advancing catalyst discovery and design.
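
The abstract does not state which objective is used to align graph and language representations. As a rough illustration only, the sketch below assumes a symmetric contrastive (InfoNCE-style) loss over paired embeddings, a common choice for aligning two modalities and not necessarily the one used in this work. The function names, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mean_diagonal_cross_entropy(logits):
    """Mean of -log softmax(row_i)[i]: row i should score highest at column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def alignment_loss(graph_embeddings, text_embeddings, temperature=0.1):
    """Symmetric contrastive loss (assumed objective, not confirmed by the abstract).

    graph_embeddings[i] and text_embeddings[i] are assumed to describe the
    same adsorbate-catalyst system; all other pairings act as negatives.
    """
    if len(graph_embeddings) != len(text_embeddings):
        raise ValueError("embeddings must come in matched pairs")
    graphs = [l2_normalize(v) for v in graph_embeddings]
    texts = [l2_normalize(v) for v in text_embeddings]
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [
        [sum(g * t for g, t in zip(gv, tv)) / temperature for tv in texts]
        for gv in graphs
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the graph-to-text and text-to-graph directions.
    return 0.5 * (
        mean_diagonal_cross_entropy(logits)
        + mean_diagonal_cross_entropy(transposed)
    )


if __name__ == "__main__":
    graph_side = [[1.0, 0.0], [0.0, 1.0]]
    print(alignment_loss(graph_side, [[2.0, 0.0], [0.0, 3.0]]))  # matched pairs
    print(alignment_loss(graph_side, [[0.0, 3.0], [2.0, 0.0]]))  # swapped pairs
```

With these toy vectors, the matched pairing gives a loss close to zero and the swapped pairing gives a much larger one. Minimizing such a loss pulls the graph and text embeddings of the same system together, which is the general idea behind aligning the two representations.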

Research Interests: AI for Science, AI for Chemistry, Material Discovery