(529d) Predictive Parametrization of PC-SAFT EoS By Neural Network Ensembles Based on Molecular Fingerprints
2023 AIChE Annual Meeting
Fuels and Petrochemicals Division
Properties and Phase Equilibria for Fuels and Petrochemicals
Tuesday, November 7, 2023 - 9:00am to 9:20am
Simulation of chemical or biotechnological processes plays a key role in the design and optimization of industrial processes and production plants. Especially in the design of modern biofuels, physical properties of a variety of different hydrocarbons (ethers, oxygenated hydrocarbons, fatty acid methyl esters) are needed, as the development of new feedstocks for fuel production is a significant challenge of our time. For this purpose, the phase behavior can be efficiently described using advanced equations of state (EoS), such as PC-SAFT. These equations of state (like most thermodynamic models) require a parametrization (pure-component parameters) for each component. This is typically performed by fitting the respective parameters to physical pure-component properties (e.g. vapor pressure, saturated liquid density). However, such experimental data are often not available, especially if complex molecules such as pharmaceuticals are considered. Thus, parametrization using non-experimental inputs is of high interest.
In this work, a neural network (NN) ensemble was developed to predict PC-SAFT pure-component parameters for non-associating molecules. Besides basic molecular input features (e.g. molar mass, number of rotatable bonds), extended-connectivity fingerprints (ECFPs) were used as the key input feature to characterize the molecules. All input features were derived directly from the readily available SMILES notation of the molecules. A dataset comprising ~300 molecules was used to train the NN ensemble. To increase the statistical validity, 5-fold cross-validation with three random initial splits was performed, creating an ensemble of 15 NNs that predicts the three PC-SAFT pure-component parameters for non-associating molecules. The ensemble prediction yielded good accuracy (AARD < 8 % between ML-predicted PC-SAFT parameters and literature PC-SAFT parameters) for the considered test dataset. Moreover, ML-predicted PC-SAFT pure-component parameters for unknown molecules (not included in training) performed remarkably well in describing the experimental (validation) data.
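As an illustration of the resampling scheme described above, the following is a minimal sketch (not the authors' implementation) of how three random shuffles combined with 5-fold cross-validation yield 15 train/validation splits, one per ensemble member, and of how an AARD value can be computed. It uses only the Python standard library. The function names, the seed, the fold assignment, the use of a simple mean to combine member predictions, and all numerical values are assumptions made for illustration; the abstract does not state how the individual NN outputs are aggregated.

```python
import random
from statistics import mean


def repeated_kfold_splits(n_samples, n_folds=5, n_repeats=3, seed=0):
    """Yield (train_idx, val_idx) pairs: n_repeats shuffles x n_folds folds."""
    rng = random.Random(seed)  # assumed seed, for reproducibility only
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Fold k takes every n_folds-th shuffled index, so folds are disjoint.
        folds = [indices[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            val_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield train_idx, val_idx


def aard_percent(predicted, reference):
    """Average absolute relative deviation between two sequences, in percent."""
    return 100.0 * mean(abs(p - r) / abs(r) for p, r in zip(predicted, reference))


def ensemble_mean(member_predictions):
    """Average parameter vectors over ensemble members (assumed aggregation)."""
    return [mean(values) for values in zip(*member_predictions)]


# One split per ensemble member: 3 shuffles x 5 folds.
splits = list(repeated_kfold_splits(n_samples=300))
print(len(splits))  # 15

# Hypothetical predictions of three members for one molecule, ordered as
# (segment number, segment diameter, dispersion energy); values are made up.
members = [[3.1, 3.7, 236.0], [2.9, 3.9, 244.0], [3.0, 3.8, 252.0]]
reference = [3.0, 3.8, 240.0]  # made-up "literature" parameter set
print(aard_percent(ensemble_mean(members), reference))
```

In an actual workflow, each of the 15 splits would train one network on fingerprint-based features, and the AARD would be evaluated on molecules held out from all training folds.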
This novel ML approach offers reliable and easy access to PC-SAFT pure-component parameters of any non-associating molecule, based solely on the chemical structure expressed as a SMILES string. Prospectively, ML-predicted pure-component parameters can be used at an early stage of process design to estimate phase behavior using a physics-based EoS without additional experimental effort.