(119e) Integrating Hybrid-Modeling with Sparse Regression - a Two-Stage Approach for Discovering Dynamics from Data | AIChE

Authors 

Ravutla, S. - Presenter, Georgia Institute of Technology
Boukouvala, F., Georgia Institute of Technology
Process modeling involves formulating mathematical representations of physical processes, typically using ordinary and/or partial differential equations (ODEs, PDEs). These equations are crucial in enabling simulation, prediction, control, and diagnosis of system behavior [1, 2]. However, many dynamic systems still lack comprehensive analytical descriptions. Building on advances in machine learning (ML), recent work integrates data-driven techniques into the modeling of system dynamics. Methodologies such as SINDy [3] and a moving-horizon optimization-based approach [4] use sparse regression to select the most relevant terms from candidate libraries. Their effectiveness has been demonstrated across various applications, including chemical reaction dynamics [5], biological processes, and control [6]. Nevertheless, open challenges in model identification remain, especially for sparse and noisy data and for complex nonlinear systems where the comprehensive pool of candidate terms cannot be known a priori [6].
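
To make the sparse-regression step concrete, the following is a minimal sketch of sequentially thresholded least squares, the selection scheme commonly associated with SINDy. It is not the authors' implementation: the Lotka-Volterra-type test system, the noise level, the six-term polynomial library, and the threshold of 0.1 are all assumptions chosen for illustration, and derivatives are sampled directly rather than estimated from time series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from a hypothetical two-state system (an assumption, not a
# case study from the abstract):
#   dx1/dt =  1.0*x1 - 0.5*x1*x2
#   dx2/dt = -1.0*x2 + 0.5*x1*x2
X = rng.uniform(0.5, 3.0, size=(200, 2))
x1, x2 = X[:, 0], X[:, 1]
dXdt = np.column_stack([1.0 * x1 - 0.5 * x1 * x2,
                        -1.0 * x2 + 0.5 * x1 * x2])
dXdt += 0.01 * rng.standard_normal(dXdt.shape)  # small measurement noise

term_names = ["1", "x1", "x2", "x1^2", "x1*x2", "x2^2"]


def build_library(X):
    """Evaluate the candidate terms listed in term_names at every sample."""
    a, b = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(a), a, b, a**2, a * b, b**2])


def stlsq(Theta, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):
            active = ~small[:, k]
            if active.any():
                # Refit only the terms that survived the threshold.
                Xi[active, k] = np.linalg.lstsq(
                    Theta[:, active], dXdt[:, k], rcond=None)[0]
    return Xi


Theta = build_library(X)
Xi = stlsq(Theta, dXdt)

for k, state in enumerate(["dx1/dt", "dx2/dt"]):
    terms = [f"{Xi[j, k]:+.3f}*{term_names[j]}"
             for j in range(len(term_names)) if Xi[j, k] != 0.0]
    print(state, "=", " ".join(terms))
```

The sketch also illustrates the limitation noted above: a term that is absent from the candidate library can never be selected, whatever the threshold.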

A promising solution to these challenges is to adopt a hybrid paradigm that merges mechanistic and black-box models to capitalize on their individual strengths [1]. In this study, we integrate a hybrid modeling paradigm with sparse regression to develop and identify models simultaneously. Within this framework, we examine two approaches that account for varying model complexity, data quality, and data availability. In the first approach, we correct the missing physics of the model in the derivative space with a neural ODE hybrid formulation [7]: we integrate SINDy-discovered models with neural ODE structures to model the unknown physics. In the second approach, we correct the missing physics of the model in the state space. For this, we employ Multifidelity Surrogate Models (MFSMs) [8] to construct composite models composed of SINDy-discovered models and error-correction models. We test and compare our methods across case studies that involve complex nonlinear terms, such as the dynamics of a non-isothermal stirred tank reactor [9] and the penicillin biosynthesis problem [7]. We will present comprehensive results on the interpolation and extrapolation performance of the hybrid model structures under varying availability of prior mechanistic knowledge and varying data quality and quantity.
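
The difference between correcting in the derivative space and in the state space can be sketched on a toy problem. Everything below is an assumption made for illustration: the scalar test system, the incomplete model `f_known` standing in for a SINDy-discovered model, and, importantly, the polynomial fits, which stand in for the neural ODE and multifidelity surrogate components used in the study.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rk4(f, x0, t):
    """Integrate the scalar ODE dx/dt = f(x) on the grid t with classical RK4."""
    x = np.empty_like(t)
    x[0] = x0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(x[i])
        k2 = f(x[i] + 0.5 * h * k1)
        k3 = f(x[i] + 0.5 * h * k2)
        k4 = f(x[i] + h * k3)
        x[i + 1] = x[i] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


def f_true(x):
    """Hypothetical true dynamics with a saturation term."""
    return 1.0 - x - 2.0 * x / (1.0 + x)


def f_known(x):
    """Incomplete model that misses the saturation term (stand-in for SINDy)."""
    return 1.0 - x


t = np.linspace(0.0, 5.0, 101)
x0 = 2.0
x_data = rk4(f_true, x0, t)   # synthetic, noise-free "measurements"
x_lf = rk4(f_known, x0, t)    # prediction of the incomplete model
train = slice(None, None, 2)  # fit on every other sample only

# Approach 1: correct in the derivative space. Learn the mismatch between
# estimated derivatives and the known right-hand side, then integrate the sum.
dxdt_data = np.gradient(x_data, t, edge_order=2)
resid_deriv = dxdt_data - f_known(x_data)
g = Polynomial.fit(x_data[train], resid_deriv[train], deg=3)
x_deriv = rk4(lambda x: f_known(x) + g(x), x0, t)

# Approach 2: correct in the state space. Learn the trajectory error of the
# incomplete model and add it to that model's prediction.
delta = Polynomial.fit(t[train], (x_data - x_lf)[train], deg=6)
x_state = x_lf + delta(t)

err_lf = np.max(np.abs(x_lf - x_data))
err_deriv = np.max(np.abs(x_deriv - x_data))
err_state = np.max(np.abs(x_state - x_data))
print(f"incomplete model: {err_lf:.3g}")
print(f"derivative-space correction: {err_deriv:.3g}")
print(f"state-space correction: {err_state:.3g}")
```

In this sketch the derivative-space correction is a function of the state, so it can be reused from other initial conditions within the sampled range, whereas the state-space correction is a function of time for this particular trajectory. How the two trade off under noise and extrapolation is the comparison the study itself addresses.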

Finally, leveraging these outcomes, we explore the potential of the hybrid models to yield new knowledge and improved mechanistic expressions. We use techniques similar to evolutionary symbolic sparse regression [10] to discover the missing physics and to estimate model parameters. Overall, this analysis explores hybrid modeling and evolutionary programming approaches, combined with sparse regression, for expedited model discovery.
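
As a rough illustration of combining evolutionary search with sparse regression, the sketch below runs a small genetic algorithm over binary masks that select terms from a candidate library, with least squares estimating the coefficients of each candidate. It is a deliberately simplified stand-in and not the method of [10] or of this study: the residual data, the library, the fitness penalty, and all algorithm settings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "missing physics" data (an assumption): r(x) = -2 x / (1 + x).
x = rng.uniform(0.2, 3.0, size=200)
r = -2.0 * x / (1.0 + x) + 0.01 * rng.standard_normal(x.size)

# Hypothetical candidate library that includes non-polynomial terms.
names = ["1", "x", "x^2", "x^3", "x/(1+x)", "exp(-x)", "sin(x)", "1/(1+x^2)"]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3, x / (1 + x),
                         np.exp(-x), np.sin(x), 1 / (1 + x**2)])


def fitness(mask, penalty=1e-3):
    """Mean squared error of the least-squares fit plus a per-term penalty."""
    if not mask.any():
        return np.mean(r**2)
    coef = np.linalg.lstsq(Theta[:, mask], r, rcond=None)[0]
    return np.mean((Theta[:, mask] @ coef - r) ** 2) + penalty * mask.sum()


def evolve(n_pop=30, n_gen=40, p_mut=0.1):
    """Genetic algorithm over term-selection masks with elitism."""
    n_terms = Theta.shape[1]
    pop = rng.random((n_pop, n_terms)) < 0.5
    pop[0] = True  # seed the population with the full library
    history = []
    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        order = np.argsort(scores)
        history.append(scores[order[0]])
        elite = pop[order][: n_pop // 2]  # best half survives unchanged
        n_child = n_pop - len(elite)
        pa = elite[rng.integers(len(elite), size=n_child)]
        pb = elite[rng.integers(len(elite), size=n_child)]
        # Uniform crossover followed by bit-flip mutation.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        children ^= rng.random(children.shape) < p_mut
        pop = np.vstack([elite, children])
    scores = np.array([fitness(m) for m in pop])
    history.append(scores.min())
    return pop[np.argmin(scores)], history


best, history = evolve()
selected = [n for n, keep in zip(names, best) if keep]
print("selected terms:", selected)
if best.any():
    # Parameter estimates for the selected terms.
    print("coefficients:", np.linalg.lstsq(Theta[:, best], r, rcond=None)[0])
```

Because the best half of each generation is carried over unchanged, the best fitness never worsens from one generation to the next. A full symbolic approach would also evolve the functional form of the terms rather than only choosing among a fixed list.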


References

  1. Bradley, W., et al., Perspectives on the integration between first-principles and data-driven modeling. Computers & Chemical Engineering, 2022. 166: p. 107898.
  2. van de Berg, D., et al., Data-driven optimization for process systems engineering applications. Chemical Engineering Science, 2022. 248: p. 117135.
  3. Brunton, S., J. Proctor, and N. Kutz, Sparse identification of nonlinear dynamics (SINDy).
  4. Lejarza, F. and M. Baldea, Data-driven discovery of the governing equations of dynamical systems via moving horizon optimization. Scientific Reports, 2022. 12(1): p. 11836.
  5. Hoffmann, M., C. Fröhner, and F. Noé, Reactive SINDy: Discovering governing reactions from concentration data. The Journal of Chemical Physics, 2019. 150(2): p. 025101.
  6. Abdullah, F. and P.D. Christofides, Data-based modeling and control of nonlinear process systems using sparse identification: An overview of recent results. Computers & Chemical Engineering, 2023: p. 108247.
  7. Bradley, W. and F. Boukouvala, Two-Stage Approach to Parameter Estimation of Differential Equations Using Neural ODEs. Industrial & Engineering Chemistry Research, 2021. 60(45): p. 16330-16344.
  8. Ravutla, S., J. Zhai, and F. Boukouvala, Hybrid Modeling and Multi-Fidelity Approaches for Data-Driven Branch-and-Bound Optimization, in Computer Aided Chemical Engineering. 2023, Elsevier. p. 1313-1318.
  9. Lejarza, F. and M. Baldea, Discovering governing equations via moving horizon learning: The case of reacting systems. AIChE Journal, 2022. 68(6): p. e17567.
  10. Askari, E. and G. Crevecoeur, Evolutionary sparse data-driven discovery of complex multibody system dynamics. arXiv preprint arXiv:2210.11656, 2022.