(11i) Data-Driven Explainable Classification for Economic Bioproduct Separation | AIChE

Authors 

Van Lehn, R., University of Wisconsin-Madison
Maravelias, C., Princeton University

Lignocellulosic biomass is a renewable feedstock for producing platform chemicals and biofuels [1, 2]. Microbial conversion of biomass often yields bioproducts at low concentration in the aqueous phase [3]. Separating these bioproducts to meet purity and recovery requirements can account for more than 70% of total production costs [4], which motivates the investigation of improved separation approaches. Liquid-liquid extraction is a low-energy separation technology based on the preferential partitioning of bioproducts from water into a selected solvent system. To reduce solvent consumption, liquid-liquid extraction is often coupled with other separations (e.g., distillation) that purify and recycle the solvent, resulting in a hybrid separation consisting of liquid-liquid extraction and distillation. Given the component relative volatilities, liquid-liquid equilibrium constants, bioproduct feed composition, and solvent price, three research questions arise: 1) when is hybrid separation feasible; 2) when is hybrid separation more favorable than distillation alone; and 3) how do system properties influence separation designs.
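
To illustrate why the liquid-liquid equilibrium constant and the solvent amount matter for extraction, the sketch below evaluates the textbook Kremser relation for countercurrent extraction. It is not the stage-by-stage model of [6]. It assumes a dilute solute, immiscible water and solvent phases, a constant equilibrium constant K (solute concentration in the solvent phase over that in the aqueous phase, on a basis consistent with the solvent-to-feed ratio), and solute-free entering solvent. The function names and the numerical values are hypothetical.

```python
def fraction_unextracted(k_partition, solvent_to_feed, n_stages):
    """Fraction of solute left in the aqueous raffinate after n ideal stages."""
    # Extraction factor E = K * S / F (assumed constant across stages).
    extraction_factor = k_partition * solvent_to_feed
    # Kremser relation written as 1 / (1 + E + ... + E^N);
    # this form also covers the special case E == 1.
    return 1.0 / sum(extraction_factor ** k for k in range(n_stages + 1))


def recovery(k_partition, solvent_to_feed, n_stages):
    """Fraction of solute transferred to the solvent phase."""
    return 1.0 - fraction_unextracted(k_partition, solvent_to_feed, n_stages)


# Hypothetical values: K = 3, solvent-to-feed ratio = 0.5, so E = 1.5.
single_stage = recovery(3.0, 0.5, 1)
five_stages = recovery(3.0, 0.5, 5)
print(f"recovery with 1 stage:  {single_stage:.3f}")
print(f"recovery with 5 stages: {five_stages:.3f}")
```

With these illustrative numbers, one stage recovers 60% of the solute and five stages recover about 95%. A larger K reaches the same recovery with less solvent, which is one reason the equilibrium constant and the solvent price both enter the cost trade-off.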

In this work, we propose a data-driven classification framework to identify the system property domains in which hybrid separation is more economical than distillation. Relative volatilities and liquid-liquid equilibrium constants are computed for 747 common solvents and 38 bioproducts using a property prediction framework based on molecular dynamics (MD) simulations, conformer sampling, and COSMO-RS (COnductor-like Screening Model for Real Solvents) [5]. Within this domain of thermodynamic properties, we sample practical thermodynamic parameters along with the bioproduct feed composition and solvent price. These parameters are inputs to two models: a hybrid separation model with stage-by-stage extraction [6] and short-cut distillation [7], and a stand-alone short-cut distillation model. For each common set of parameters, both models are solved to global optimality, and the minimum separation costs of hybrid separation and distillation are compared. Comparing the costs of the feasible designs yields the separation decision: hybrid separation or distillation. To facilitate decision making across the enormous solvent and bioproduct space, we then train random forest classifiers that map the input parameters to the separation decision, enabling fast prediction. We compute Shapley values to interpret the influence of each input parameter on the separation decision and to identify the parameter interactions that have a major impact on it [8]. We show that solvent price and water-bioproduct relative volatility are the key parameters in the selection between hybrid separation and distillation. We discuss how the separation structure varies with the liquid-liquid equilibrium constant of the solvent and with the relative volatilities. Finally, we demonstrate rapid solvent screening for bioproduct separation using the trained classifiers.
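
As a minimal sketch of how Shapley values attribute a separation decision to its input parameters, the example below enumerates all feature orderings for a three-parameter toy rule. Everything in it is an assumption made for illustration: `toy_decision` is a hypothetical stand-in for a trained classifier, the thresholds and parameter values are invented, and the exact enumeration against a single baseline point replaces the tree-specific algorithm of [8], which scales to real random forests.

```python
from itertools import permutations
from math import factorial


def toy_decision(x):
    """Hypothetical rule: 1.0 means 'hybrid', 0.0 means 'distillation'."""
    cheap_solvent = x["solvent_price"] < 2.0       # invented threshold
    hard_distillation = x["rel_volatility"] < 1.5  # invented threshold
    strong_partition = x["partition_coeff"] > 3.0  # invented threshold
    favors_hybrid = hard_distillation and (cheap_solvent or strong_partition)
    return 1.0 if favors_hybrid else 0.0


def shapley_values(model, instance, baseline):
    """Exact Shapley values by averaging marginal contributions over orderings."""
    names = list(instance)
    phi = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)       # start from the reference point
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its actual value
            updated = model(current)
            phi[name] += updated - previous
            previous = updated
    n_orderings = factorial(len(names))
    return {name: total / n_orderings for name, total in phi.items()}


# Hypothetical system to explain and hypothetical reference system.
instance = {"solvent_price": 1.0, "rel_volatility": 1.2, "partition_coeff": 5.0}
baseline = {"solvent_price": 5.0, "rel_volatility": 3.0, "partition_coeff": 1.0}

phi = shapley_values(toy_decision, instance, baseline)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the decision for the instance and the decision for the baseline. In this toy rule the relative volatility receives the largest share because hybrid separation is never selected unless distillation is difficult; the result says nothing about the actual trained classifiers.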

[1] Baral, N.R., Sundstrom, E.R., Das, L., Gladden, J., Eudes, A., Mortimer, J.C., Singer, S.W., Mukhopadhyay, A. and Scown, C.D. Approaches for more efficient biological conversion of lignocellulosic feedstocks to biofuels and bioproducts. ACS Sustainable Chemistry & Engineering, 7(10), 9062-9079, 2019.

[2] Kim, J., Sen, S.M. and Maravelias, C.T. An optimization-based assessment framework for biomass-to-fuel conversion strategies. Energy & Environmental Science, 6(4), 1093-1104, 2013.

[3] Luterbacher, J., Alonso, D.M. and Dumesic, J. Targeted chemical upgrading of lignocellulosic biomass to platform molecules. Green Chemistry, 16(12), 4816-4838, 2014.

[4] Yenkie, K.M., Wu, W., Clark, R.L., Pfleger, B.F., Root, T.W. and Maravelias, C.T. A roadmap for the synthesis of separation networks for the recovery of bio-based chemicals: matching biological and process feasibility. Biotechnology Advances, 34(8), 1362-1383, 2016.

[5] Li, J., Maravelias, C.T., Van Lehn, R.C. Adaptive conformer sampling for property prediction using the COnductor-like Screening Model for Real Solvents. Submitted.

[6] Taifan, G.S. and Maravelias, C.T. Integration of graphical approaches into optimization-based design of multistage liquid extraction. Computers & Chemical Engineering, 143, 107126, 2020.

[7] Ryu, J. and Maravelias, C.T. Computationally efficient optimization models for preliminary distillation column design and separation energy targeting. Computers & Chemical Engineering, 143, 107072, 2020.

[8] Lundberg, S.M., Erion, G., Chen, H., DeGrave, A., Prutkin, J.M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N. and Lee, S.I. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56-67, 2020.