(59ae) Efficient Hybrid Modeling and Sorption Model Discovery for Non-Linear Advection-Diffusion-Sorption Systems: A Systematic Scientific Machine Learning Approach | AIChE

Authors 

Rebello, C. - Presenter, Norwegian University of Science and Technology
Viena Santana, V., Norwegian University of Science and Technology
Costa, E., Norwegian University of Science and Technology
Ribeiro, A. M., LSRE - Laboratory of Separation and Reaction Engineering - Associate Laboratory LSRE/LCM
Rackauckas, C., Massachusetts Institute of Technology
Nogueira, I., LA / LSRE - LCM
Mathematical modeling is a crucial aspect of science and engineering, with two main approaches in chemical engineering: mechanistic/classical and empirical [1]. The former relies on conservation equations, transport, and thermodynamic expressions, while the latter is based on observations without assumptions about the underlying physics. Hybrid models, which merge empirical equations and physics knowledge, have gained increasing attention in engineering [1]–[5].

Hybrid serial models in chemical engineering may combine equations derived from first principles with a universal approximator, such as an Artificial Neural Network (ANN), that replaces one or more terms of the equation. The resulting object is called a Universal Differential Equation (UDE) [6]. Applications of hybrid models have mainly focused on bioreaction engineering, with ANNs replacing kinetic reaction laws [7], [8]. However, the computational challenge of fitting ANN parameters inside a differential equation has limited their application to more complex systems, such as those described by partial differential equations (PDEs) [7], [9], [10]. Some works have pioneered the application of UDEs to advection-diffusion-sorption PDEs [9]–[11].
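To make the idea of replacing one term concrete, the following is a minimal sketch, not the authors' model. It assumes a single well-mixed cell instead of the full PDE, and all names and parameter values (`hybrid_rhs`, `phase_ratio`, the Langmuir and LDF constants) are hypothetical. The balance equations stay mechanistic, and only the uptake-rate term is swapped between a known kinetic law and a small tanh network.

```python
import math
import random

def make_mlp(n_in, n_hidden, seed=0):
    """Randomly initialised one-hidden-layer tanh network with a scalar output."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2

def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def ldf_langmuir_uptake(c, q, k=0.5, q_max=2.0, K=1.0):
    """Mechanistic reference: linear driving force towards a Langmuir equilibrium."""
    q_eq = q_max * K * c / (1.0 + K * c)
    return k * (q_eq - q)

def hybrid_rhs(c, q, uptake, phase_ratio=1.5):
    """Well-mixed cell balance (assumed); only the uptake term is swappable."""
    r = uptake(c, q)
    return -phase_ratio * r, r  # (dc/dt, dq/dt)

# Same balance equations, with the uptake law replaced by an untrained ANN.
params = make_mlp(n_in=2, n_hidden=15)
ann_uptake = lambda c, q: mlp(params, [c, q])
dc_dt, dq_dt = hybrid_rhs(1.0, 0.2, ann_uptake)
```

Because the network only enters through `uptake`, the mass balance between the fluid and solid phases holds by construction, whatever the network predicts.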

Gradient-based or Hessian-based optimizers are preferred for training neural networks [12], but calculating accurate and efficient gradients remains an open issue in UDEs for advection-diffusion-sorption PDEs. Automatic Differentiation (AD) can provide high-accuracy gradients but can be computationally demanding and numerically unstable for non-linear advection-diffusion-sorption PDE problems [13]. This work demonstrates a feasible and efficient way for training hybrid non-linear advection-diffusion-sorption PDE problems using gradient-based optimizers. The present work also addresses the interpretability of ANNs' predictions by using symbolic or sparse regression on ANNs' output [6]. This approach reduces computational burden and allows for finding a combination of simple functions that share similar properties with the trained ANN. The in-silico dataset used in this work simulates breakthrough curves of a hypothetical single-component non-linear advection-diffusion-sorption system with various isotherms and kinetic models [14].

Overall, this work aims to help engineers efficiently introduce hybrid modeling in packed-bed separation for improving predictive power or discovering mass-transfer kinetics directly from breakthrough curves data. The methodology includes the in-silico dataset building process, hybrid model proposition, numerical aspects of the hybrid model solution and gradient calculation, and sparse regression details.

The ANN architecture is typically chosen before training, with hyperparameter optimization used in traditional deep learning to select the best architecture for a problem [6], [15]. In this work, one-layer or two-layer ANNs with hyperbolic tangent activation were used, and the number of neurons, between 15 and 25, was chosen by grid search. The ADAM optimizer [12] was run for 180 iterations with a learning rate of 0.05 and exponential learning rate decay, applying a drop factor of 0.985 every 20 iterations. After the second fit with ADAM, the BFGS method [16] was employed until convergence.
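The learning rate schedule described above can be sketched as follows. This assumes a staircase interpretation, in which the rate is multiplied by the drop factor once every 20 iterations; the function name `learning_rate` is hypothetical and the actual implementation may differ.

```python
def learning_rate(iteration, initial=0.05, drop=0.985, period=20):
    """Step-wise exponential decay: multiply by `drop` once every `period` iterations."""
    return initial * drop ** (iteration // period)

# Learning rate at each of the 180 ADAM iterations.
schedule = [learning_rate(k) for k in range(180)]
```

Under this reading the rate stays at 0.05 for the first 20 iterations and has been reduced eight times by the final iteration.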

The numerical solution of the hybrid PDE model and the gradient calculation are crucial, as the stability and speed of the gradient calculation are influenced by the discretization and the numerical integration method. In the fixed-bed chromatography literature, Orthogonal Collocation on Finite Elements (OCFEM) is preferred [17] due to its high accuracy with steep gradients in concentration profiles. OCFEM is used with cubic Hermite polynomials [18] and zeros of shifted orthogonal Legendre polynomials as collocation points. A total of 42 evenly spaced finite elements were used, and the resulting ODE system was solved using a fixed-leading-coefficient, adaptive-order, adaptive-time BDF method (FBDF) implemented in DifferentialEquations.jl [19].
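As an illustration of the collocation grid, the sketch below places the zeros of the degree-2 shifted Legendre polynomial inside each of 42 equal elements. Two interior points per element is the usual choice for cubic Hermite collocation, but it is an assumption here, as are the dimensionless bed length of 1.0 and the function names.

```python
import math

def shifted_legendre_zeros_2():
    """Zeros of the degree-2 shifted Legendre polynomial 6x^2 - 6x + 1 on [0, 1]."""
    offset = math.sqrt(3.0) / 6.0
    return (0.5 - offset, 0.5 + offset)

def collocation_grid(n_elements=42, length=1.0):
    """Interior collocation points for evenly spaced elements on [0, length]."""
    h = length / n_elements
    zeros = shifted_legendre_zeros_2()
    points = []
    for e in range(n_elements):
        left = e * h  # left edge of element e
        points.extend(left + h * z for z in zeros)
    return points

grid = collocation_grid()
```

With these assumptions the grid holds 84 interior points; the boundary conditions at the bed inlet and outlet supply the remaining equations.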

Gradient calculation for the hybrid model is formulated using an L2-norm-based discrete cost function. Continuous or discrete adjoint sensitivity analysis is preferred for calculating gradients when the sum of parameters and equations is large. The quadrature adjoint with JIT-compiled reverse-mode AD vector-Jacobian product and the FBDF solver is used for solving the adjoint problem in this work, as it was the most performant and stable method [13].
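A discrete L2-norm cost of the kind mentioned above can be written as a sum of squared residuals at the sampling times. The sketch below is only an assumed form: the source does not state whether the cost is weighted or normalised, and the sample values are invented for illustration.

```python
def l2_cost(predicted, observed):
    """Sum of squared residuals between model output and data at the same time points."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

# Hypothetical outlet concentrations at four sampling times.
model_curve = [0.0, 0.1, 0.6, 1.0]
noisy_data = [0.0, 0.2, 0.5, 1.0]
cost = l2_cost(model_curve, noisy_data)
```

The adjoint method then provides the gradient of this scalar with respect to all ANN parameters at a cost that does not grow with the number of parameters in the way forward sensitivities do.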

In summary, the presented method employs one-layer or two-layer ANNs with hyperbolic tangent activation, grid search to select the number of neurons, and the ADAM optimizer followed by the BFGS method for training. OCFEM is used for discretization, and adjoint sensitivity analysis is chosen for gradient calculation, using the quadrature adjoint with JIT-compiled reverse-mode AD vector-Jacobian product and the FBDF solver. This approach provides a stable and efficient way to train hybrid models in non-linear advection-diffusion-sorption PDE systems.

The results show that the UDE approach fits breakthrough training data well, with errors compatible with simulated noise and no apparent auto-correlation in time. It performs well in the test set for desorption and adsorption from another steady state, and the uptake rate is close to the training noiseless ground-truth data. Similar conclusions can be drawn for other models. The hybrid approach also fits breakthrough training data well, with good performance in the test set and uptake rate close to noiseless ground-truth data. However, in some cases, the uptake rate is underestimated despite the good fitting of breakthrough data.

Sparse and symbolic regression techniques were used to obtain polynomials for Langmuir, Sips, and Vermeulen's isotherms with LDF and improved LDF kinetics. The obtained polynomials resemble the true kinetic models that produced the data, and the train and test set predictions were very close to the noisy observations. Symbolic regression did not require tuning the sparsity parameter and produced simpler expressions with fewer terms. A Taylor expansion analysis demonstrated why polynomials with positive integer exponents found in sparse and symbolic regression explain the observations well.
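The Taylor expansion argument can be illustrated with the Langmuir isotherm, q* = q_max K c / (1 + K c), whose Maclaurin series in K c contains only positive integer powers. The sketch below is an illustration with hypothetical parameter values, not the authors' analysis; it compares the isotherm with its truncated series for |K c| < 1.

```python
def langmuir(c, q_max=2.0, K=0.5):
    """Langmuir equilibrium loading."""
    return q_max * K * c / (1.0 + K * c)

def langmuir_taylor(c, order, q_max=2.0, K=0.5):
    """Truncated Maclaurin series: q_max * sum_{n=1..order} (-1)^(n+1) (K c)^n."""
    x = K * c
    return q_max * sum((-1) ** (n + 1) * x ** n for n in range(1, order + 1))

c = 0.4  # K*c = 0.2, inside the convergence radius |K c| < 1
errors = [abs(langmuir(c) - langmuir_taylor(c, n)) for n in (1, 2, 3, 4)]
```

The truncation error shrinks with each added term, which is consistent with a low-order polynomial in concentration reproducing an LDF uptake rate well over a limited concentration range.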

[1] J. Sansana et al., ‘Recent trends on hybrid modeling for Industry 4.0’, Computers and Chemical Engineering, vol. 151. Elsevier Ltd, Aug. 01, 2021. doi: 10.1016/j.compchemeng.2021.107365.

[2] L. von Rueden, S. Mayer, R. Sifa, C. Bauckhage, and J. Garcke, ‘Combining Machine Learning and Simulation to a Hybrid Modelling Approach: Current and Future Directions’, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12080 LNCS, pp. 548–560, 2020, doi: 10.1007/978-3-030-44584-3_43.

[3] I. Pan, L. R. Mason, and O. K. Matar, ‘Data-centric Engineering: integrating simulation, machine learning and statistics. Challenges and opportunities’, Chem Eng Sci, vol. 249, p. 117271, 2022, doi: 10.1016/j.ces.2021.117271.

[4] M. von Stosch, R. Oliveira, J. Peres, and S. Feyo de Azevedo, ‘Hybrid semi-parametric modeling in process systems engineering: Past, present and future’, Comput Chem Eng, vol. 60, pp. 86–101, 2014, doi: 10.1016/j.compchemeng.2013.08.008.

[5] M. Raissi, P. Perdikaris, and G. E. Karniadakis, ‘Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations’, J Comput Phys, vol. 378, pp. 686–707, 2019, doi: 10.1016/j.jcp.2018.10.045.

[6] C. Rackauckas et al., ‘Universal Differential Equations for Scientific Machine Learning’, pp. 1–55, Jan. 2020, [Online]. Available: http://arxiv.org/abs/2001.04385

[7] S. Feyo De Azevedo, B. Dahm, and F. R. Oliveira, ‘Hybrid modelling of biochemical processes: A comparison with the conventional approach’, Comput Chem Eng, vol. 21, no. SUPPL.1, 1997, doi: 10.1016/s0098-1354(97)87593-x.

[8] H. J. Zander, R. Dittmeyer, and J. Wagenhuber, ‘Dynamic modeling of chemical reaction systems with neural networks and hybrid models’, Chem Eng Technol, vol. 22, no. 7, 1999, doi: 10.1002/(SICI)1521-4125(199907)22:7<571::AID-CEAT571>3.0.CO;2-5.

[9] H. Narayanan, M. Luna, M. Sokolov, P. Arosio, A. Butté, and M. Morbidelli, ‘Hybrid Models Based on Machine Learning and an Increasing Degree of Process Knowledge: Application to Capture Chromatographic Step’, Ind Eng Chem Res, p. acs.iecr.1c01317, Jul. 2021, doi: 10.1021/acs.iecr.1c01317.

[10] H. Narayanan, T. Seidler, M. F. Luna, M. Sokolov, M. Morbidelli, and A. Butté, ‘Hybrid Models for the simulation and prediction of chromatographic processes for protein capture’, J Chromatogr A, vol. 1650, p. 462248, 2021, doi: 10.1016/j.chroma.2021.462248.

[11] T. Praditia, M. Karlbauer, S. Otte, S. Oladyshkin, M. V. Butz, and W. Nowak, ‘Finite Volume Neural Network: Modeling Subsurface Contaminant Transport’, Apr. 2021, [Online]. Available: http://arxiv.org/abs/2104.06010

[12] D. P. Kingma and J. L. Ba, ‘Adam: A method for stochastic optimization’, in 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.

[13] Y. Ma, V. Dixit, M. J. Innes, X. Guo, and C. Rackauckas, ‘A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions’, 2021 IEEE High Performance Extreme Computing Conference, HPEC 2021, no. 2, 2021, doi: 10.1109/HPEC49654.2021.9622796.

[14] Z. Li et al., ‘A numerical modelling study of SO2 adsorption on activated carbons with new rate equations’, Chemical Engineering Journal, vol. 353, no. July, pp. 858–866, 2018, doi: 10.1016/j.cej.2018.07.119.

[15] S. Kim, W. Ji, S. Deng, Y. Ma, and C. Rackauckas, ‘Stiff neural ordinary differential equations’, Chaos, vol. 31, no. 9, 2021, doi: 10.1063/5.0060697.

[16] J. Nocedal and S. J. Wright, Numerical Optimization Second Edition. 2006.

[17] B. A. Finlayson, ‘Orthogonal collocation on finite elements’, Chem Eng Sci, vol. 30, no. 1, pp. 587–596, 1974, doi: 10.1016/0378-4754(80)90097-X.

[18] I. A. Ganaie, S. Arora, and V. K. Kukreja, ‘Cubic Hermite Collocation Method for Solving Boundary Value Problems with Dirichlet, Neumann, and Robin Conditions’, International Journal of Engineering Mathematics, vol. 2014, pp. 1–8, Feb. 2014, doi: 10.1155/2014/365209.

[19] C. Rackauckas, M. Innes, Y. Ma, J. Bettencourt, L. White, and V. Dixit, ‘DiffEqFlux.jl - A Julia Library for Neural Differential Equations’, no. February, 2019, [Online]. Available: http://arxiv.org/abs/1902.02376