(12d) Thermodynamics-Informed Graph Neural Networks for Predicting Molecular and Mixture Properties | AIChE

Authors 

Rittig, J. G. - Presenter, RWTH Aachen University
Lapkin, A. A., University of Cambridge
Mitsos, A., RWTH Aachen University
Machine learning (ML) for the prediction of physicochemical properties of molecules and their mixtures has recently become a very active research field. A variety of ML approaches, such as graph neural networks (GNNs), matrix completion methods (MCMs), and transformers, have been investigated for predicting properties that are relevant to chemical engineering, e.g., solvation free energies [1] and activity coefficients [2-6]. These ML models have shown superior accuracy compared to well-established thermodynamic models like UNIFAC or COSMO-RS, cf. [2-4]. However, the training of the ML models typically considers only the prediction loss, i.e., the deviation between predicted property values and provided simulation or experimental data, while thermodynamic principles are neglected. This can lead to thermodynamic inconsistencies in model predictions and requires large data sets for training, making hybrid approaches that incorporate thermodynamics desirable.

Hybrid approaches allow the exploitation of thermodynamic knowledge in molecular ML models. For predicting molecular and mixture properties with ML, hybrid models have mainly focused on embedding thermodynamic equations into the ML model architecture, cf. reviews in [7,8]. However, embedding equations from thermodynamic models also introduces the corresponding modeling assumptions and predictive limitations into the ML model. Therefore, we consider an alternative hybrid approach: physics-informed ML [9], that is, adding physical equations in the form of a regularization term to the loss function when training an ML model. This concept has gained much traction for solving partial differential equations (PDEs) with physics-informed neural networks (PINNs), which use automatic differentiation to add gradient information of the neural network to the loss function accounting for the PDE [10]. Recently, physics-informed ML has also been applied to material and molecular property prediction [11-13], e.g., by utilizing differential relations between thermodynamic properties and the Helmholtz free energy. Since such thermodynamics-informed approaches are based on fundamental thermodynamic equations, they do not impose modeling assumptions and preserve the predictive flexibility of ML models. However, applications to the prediction of thermodynamic mixture properties for a diverse set of molecules are still missing.
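
The regularization idea described above can be written schematically as follows. This is only an illustrative form: the mean-squared-error terms, the weighting factor \lambda, and the symbols \theta (model parameters), \hat{y}_i (predictions), y_i (data), and r_j (residual of the physical equation at evaluation point j) are notational assumptions, not taken from the abstract.

```latex
\mathcal{L}(\theta)
  = \underbrace{\frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{y}_i(\theta) - y_i\bigr)^2}_{\text{prediction loss}}
  + \lambda \,
    \underbrace{\frac{1}{M}\sum_{j=1}^{M} r_j(\theta)^2}_{\text{physics regularization}}
```

The residual r_j vanishes when the model satisfies the physical equation exactly, so the second term penalizes only physically inconsistent predictions.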

We propose thermodynamics-informed GNNs for predicting properties of molecules and mixtures. Specifically, we demonstrate our approach using the example of activity coefficients of binary mixtures [6]. Here, we utilize the Gibbs-Duhem equation, which provides a differential relationship between the activity coefficients of a binary mixture as its composition changes. Hence, Gibbs-Duhem consistency can be evaluated via automatic differentiation of the GNN. The proposed thermodynamics-informed GNNs use the Gibbs-Duhem equation as a regularization term in addition to the prediction loss during model training, thereby learning thermodynamic principles. We train the GNNs on a large data set of activity coefficients generated by Qin et al. [5], which covers 40,000 different binary mixtures at 7 different compositions. The results show that thermodynamics-informed GNNs achieve significantly higher Gibbs-Duhem consistency while retaining high accuracy for binary activity coefficient predictions, compared to state-of-the-art GNN models trained only on the prediction loss. The thermodynamics-informed GNNs further generalize better to compositions absent from the training data. Our approach allows for easy incorporation of additional thermodynamic knowledge, making it promising for application to other molecular and mixture properties. The models and code are available as open source [6].
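
To make the Gibbs-Duhem regularization concrete, the following is a minimal sketch and not the authors' implementation. For a binary mixture at constant temperature and pressure, the Gibbs-Duhem equation reads x1 · d(ln γ1)/dx1 + x2 · d(ln γ2)/dx1 = 0 with x2 = 1 - x1. The sketch assumes a hypothetical two-parameter stand-in model `ln_gammas` in place of a trained GNN, a small hand-written forward-mode automatic differentiation class `Dual` in place of a deep learning framework, and an assumed weighting factor `lam`; all of these names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Forward-mode AD number: `val` is the value, `der` is d(val)/dx1."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u * v' + u' * v
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def _lift(value):
    """Wrap plain numbers as constants (zero derivative)."""
    return value if isinstance(value, Dual) else Dual(float(value))


def ln_gammas(x1, a=1.2, b=1.2):
    """Hypothetical stand-in for a trained model (assumption, not a GNN).

    Returns (ln gamma_1, ln gamma_2). For a == b this is the one-parameter
    Margules model, which satisfies the Gibbs-Duhem equation; for a != b
    it does not.
    """
    x2 = 1.0 - x1
    return a * x2 * x2, b * x1 * x1


def gibbs_duhem_residual(x1, **params):
    """x1 * d(ln gamma_1)/dx1 + x2 * d(ln gamma_2)/dx1 at constant T, p."""
    seed = Dual(x1, 1.0)  # seed derivative dx1/dx1 = 1
    ln_g1, ln_g2 = ln_gammas(seed, **params)
    return x1 * ln_g1.der + (1.0 - x1) * ln_g2.der


def total_loss(data, lam=1.0, **params):
    """Prediction MSE plus lam times the mean squared Gibbs-Duhem residual.

    `data` is a list of (x1, ln_gamma_1_target, ln_gamma_2_target) tuples.
    """
    pred_loss = 0.0
    gd_loss = 0.0
    for x1, target_1, target_2 in data:
        ln_g1, ln_g2 = ln_gammas(x1, **params)
        pred_loss += (ln_g1 - target_1) ** 2 + (ln_g2 - target_2) ** 2
        gd_loss += gibbs_duhem_residual(x1, **params) ** 2
    n = len(data)
    return pred_loss / n + lam * gd_loss / n
```

In this toy setting, `gibbs_duhem_residual(0.3, a=1.2, b=1.2)` is zero up to floating-point rounding, whereas choosing `a != b` yields a nonzero residual that the regularization term penalizes. In a GNN setting the derivative with respect to composition would instead come from the framework's automatic differentiation, and the residual could also be evaluated at compositions for which no activity coefficient data exist.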

References

[1] Vermeire, F. H., & Green, W. H. (2021). Transfer learning for solvation free energies: From quantum chemistry to experiments. Chemical Engineering Journal, 418, 129307.

[2] Medina, E. I. S., Linke, S., Stoll, M., & Sundmacher, K. (2022). Graph neural networks for the prediction of infinite dilution activity coefficients. Digital Discovery, 1(3), 216-225.

[3] Jirasek, F., Alves, R. A., Damay, J., Vandermeulen, R. A., Bamler, R., Bortz, M., Mandt, S., Kloft, M. & Hasse, H. (2020). Machine learning in thermodynamics: Prediction of activity coefficients by matrix completion. The Journal of Physical Chemistry Letters, 11(3), 981-985.

[4] Winter, B., Winter, C., Esper, T., Schilling, J., & Bardow, A. (2023). SPT-NRTL: A physics-guided machine learning model to predict thermodynamically consistent activity coefficients. Fluid Phase Equilibria, 568, 113731.

[5] Qin, S., Jiang, S., Li, J., Balaprakash, P., Van Lehn, R. C., & Zavala, V. M. (2023). Capturing molecular interactions in graph neural networks: A case study in multi-component phase equilibrium. Digital Discovery, 2(1), 138-151.

[6] Rittig, J. G., Felton, K. C., Lapkin, A. A., & Mitsos, A. (2023). Gibbs–Duhem-informed neural networks for binary activity coefficient prediction. Digital Discovery, 2(6), 1752-1767.

[7] Carranza-Abaid, A., Svendsen, H. F., & Jakobsen, J. P. (2023). Thermodynamically consistent vapor-liquid equilibrium modelling with artificial neural networks. Fluid Phase Equilibria, 564, 113597.

[8] Jirasek, F., & Hasse, H. (2023). Combining machine learning with physical knowledge in thermodynamic modeling of fluid mixtures. Annual Review of Chemical and Biomolecular Engineering, 14, 31-51.

[9] Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., & Yang, L. (2021). Physics-informed machine learning. Nature Reviews Physics, 3(6), 422-440.

[10] Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686-707.

[11] Masi, F., Stefanou, I., Vannucci, P., & Maffi-Berthier, V. (2021). Thermodynamics-based artificial neural networks for constitutive modeling. Journal of the Mechanics and Physics of Solids, 147, 104277.

[12] Rosenberger, D., Barros, K., Germann, T. C., & Lubbers, N. (2022). Machine learning of consistent thermodynamic models using automatic differentiation. Physical Review E, 105(4), 045301.

[13] Hernandez, Q., Badias, A., Chinesta, F., & Cueto, E. (2022). Thermodynamics-informed graph neural networks. arXiv preprint arXiv:2203.01874.