(494d) Development of a Machine Learning Algorithm to Predict Diverse System Solubility | AIChE

Authors 

Sen, S., North Carolina State University
Hughes-Oliver, J., North Carolina State University
Baynes, R., North Carolina State University

Predicting solubility in diverse solute-solvent systems is important for many industrial applications that rely on the solvation and crystallization of organic and inorganic compounds. Because solubility depends strongly on molecular interactions in the liquid phase, it is difficult to estimate. Standard activity coefficient models such as NRTL, UNIQUAC, and UNIFAC may produce incorrect solubility estimates in complex systems, especially at extreme temperatures and pressures. Several empirical machine learning models have also been developed; these succeed for some classes of systems (such as aqueous solutions) but have limited applicability elsewhere.
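
To make the role of the activity coefficient concrete, the sketch below evaluates the textbook simplified solid-liquid equilibrium relation, ln(gamma * x) = -(dH_fus / R) * (1/T - 1/T_m), which neglects heat-capacity and pressure corrections. It is not the model described in this abstract. The function name, its parameters, and the approximate naphthalene-like property values are assumptions chosen for illustration, and gamma is treated as a known constant, whereas in a real activity coefficient model it depends on composition and must be solved for iteratively.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def sle_mole_fraction(t_kelvin, t_melt, dh_fus, gamma=1.0):
    """Solute mole fraction from the simplified SLE relation.

    Hypothetical helper for illustration only.
    t_kelvin : system temperature, K
    t_melt   : solute melting temperature, K
    dh_fus   : solute enthalpy of fusion, J/mol
    gamma    : solute activity coefficient, assumed constant here
               (gamma = 1 gives the ideal solubility)
    """
    if t_kelvin <= 0 or t_melt <= 0 or gamma <= 0:
        raise ValueError("temperatures and gamma must be positive")
    # ln(gamma * x) = -(dH_fus / R) * (1/T - 1/T_m)
    ln_activity = -(dh_fus / R) * (1.0 / t_kelvin - 1.0 / t_melt)
    return math.exp(ln_activity) / gamma


if __name__ == "__main__":
    # Approximate, illustrative values for a naphthalene-like solute.
    t_melt = 353.4     # K (assumed)
    dh_fus = 19000.0   # J/mol (assumed)
    for gamma in (1.0, 2.0, 10.0):
        x = sle_mole_fraction(298.15, t_melt, dh_fus, gamma)
        print(f"gamma = {gamma:5.1f}  ->  x = {x:.4f}")
```

With these inputs the ideal case (gamma = 1) gives a mole fraction of roughly 0.3, and every other case scales as 1/gamma. This is why an error in the predicted activity coefficient carries over directly into the solubility estimate.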

Our work develops a broadly applicable machine learning model that predicts solubility across a large space of solute/solvent combinations and makes reasonable predictions at temperatures and pressures far from ambient conditions. Although we restrict attention to small solute molecules, we examine the polymorphic and crystal lattice structures of the solutes in the mined literature to make sure we represent a wide chemical space. Because most available experimental data lie near standard ambient temperature and pressure, we also include solid-liquid equilibrium (SLE) data at high temperatures and pressures, a choice motivated by several industrial processes. Furthermore, at high temperatures and pressures, solubility is usually nonmonotonic. To capture this behavior in our model, we develop a new thermodynamically motivated set of mixing rules. To ensure solvent diversity, we consider both polar and nonpolar inorganic solvents. We treat both single solvents and binary solvent mixtures, cataloging the primary and secondary solvent concentrations in each mixture.

The method can model solubility for both sparingly soluble and highly soluble solutes. We have applied the model to a variety of solvents at extreme temperatures to test its applicability, with promising results.