(578h) Enhancing Computational Speed in the Discopde Symbolic Regression Framework Using Time-Efficient Integration Methods | AIChE

Authors 

Hillman, I. - Presenter, University of Connecticut
Cohen, B., University of Connecticut
Bollas, G., University of Connecticut

Partial differential equations (PDEs) are crucial for the optimal design and control of chemical systems because they describe how processes vary in space and time. Data-driven machine learning (ML) techniques such as neural networks are currently used to discover PDEs, but they are limited by their training data, require known inputs, and lack interpretability. In our recent work [1], we presented a framework that discovers human-interpretable PDEs from scarce and noisy data by performing symbolic regression (SR) via genetic programming (GP). When data collection is costly or difficult, the framework can recover ground-truth models from only a few time-series points. However, the original framework is computationally slow. Reducing its computational overhead could support real-time decision-making and more effective resource allocation. This study aims to reduce integration time, and thereby the overall run time of the framework, while still recovering accurate, complete models.

An analysis of the framework's computational cost revealed that the integration step has the greatest impact on run time. We hypothesize that the integrator error is small compared with the inherent noise in the data, which follows a Gaussian distribution; thus, a faster integrator with larger integration error would not bias that distribution, and the discovered model would remain accurate. First, we conducted quantile-quantile (QQ) and Shapiro-Wilk analyses to determine whether the model error caused by the numerical integrator follows a Gaussian distribution. Next, we passed the integration results to the parameter estimation step of the framework, which yielded an incomplete expression. Then, we iteratively evaluated the fitness of each incomplete expression to discover a complete model, which was ultimately compared to the ground truth.
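As a rough illustration of the QQ analysis mentioned above, the following sketch computes normal QQ points and their correlation coefficient for two synthetic residual sets. It is not the authors' code: the residuals are randomly generated stand-ins, and the function names `qq_points`, `pearson`, and `qq_correlation` are hypothetical. A correlation close to 1 suggests the residuals are consistent with a Gaussian distribution, while a visibly lower value suggests a departure from it.

```python
import math
import random
from statistics import NormalDist


def qq_points(residuals):
    """Return (theoretical, sample) quantiles for a normal QQ plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n avoid infinite quantiles at 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def qq_correlation(residuals):
    """Correlation between sample quantiles and standard normal quantiles."""
    theoretical, ordered = qq_points(residuals)
    return pearson(theoretical, ordered)


# Synthetic stand-ins (assumptions): Gaussian noise versus a skewed error.
rng = random.Random(0)
gaussian_residuals = [rng.gauss(0.0, 0.05) for _ in range(500)]
skewed_residuals = [rng.expovariate(1.0) for _ in range(500)]

r_gauss = qq_correlation(gaussian_residuals)
r_skewed = qq_correlation(skewed_residuals)
print(f"QQ correlation, Gaussian residuals: {r_gauss:.4f}")
print(f"QQ correlation, skewed residuals:   {r_skewed:.4f}")
```

In practice, library routines such as `scipy.stats.probplot` and `scipy.stats.shapiro` would typically be used for the QQ and Shapiro-Wilk analyses; the hand-written version here only keeps the example dependency-free.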

A fused explicit and implicit Euler integrator, as used in Liquid Time-Constant Networks [2], was among the fixed-step, variable-step, implicit, and explicit integrators tested on one-dimensional, single-component, isothermal dynamic plug flow reactor (PFR) models. For each integrator, the mean squared error (MSE) was computed to measure the average integration error, and the integration time was recorded. The model discovered by the framework was then compared to the ground-truth model to identify the fastest integrator that retains solution fidelity. The study was also extended to PFR design equations with axial dispersion, for which boundary-value problems (BVPs) were studied.
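To make the idea of a fused explicit/implicit Euler step concrete, here is a minimal sketch applied to a hypothetical isothermal PFR with first-order kinetics, discretized in space with a first-order upwind scheme. Everything specific is an assumption for illustration and is not taken from the study: the kinetics, the parameter values, the upwind discretization, and the function names. For an ODE of the form dc/dt = -a·c + s, the fused step evaluates the source s at the old time level and the linear decay term at the new one, giving c_new = (c + Δt·s) / (1 + Δt·a), which mirrors the structure of the fused solver in [2]. The MSE is computed against the analytical steady-state profile c(x) = c_in·exp(-k·x/u).

```python
import math


def fused_euler_step(c, dt, decay, source):
    """One fused Euler step for dc/dt = -decay * c + source.

    The source is explicit (old time level); the linear decay is implicit
    (new time level): c_new = (c + dt * source) / (1 + dt * decay).
    """
    return [(ci + dt * si) / (1.0 + dt * decay) for ci, si in zip(c, source)]


def simulate_pfr(n_cells=50, length=1.0, u=1.0, k=1.0, c_in=1.0,
                 dt=0.01, n_steps=2000):
    """March dc/dt = -u * dc/dx - k * c to steady state (upwind in space)."""
    dx = length / n_cells
    c = [0.0] * n_cells            # reactor initially free of reactant
    decay = u / dx + k             # coefficient multiplying c_i in each cell
    for _ in range(n_steps):
        upstream = [c_in] + c[:-1]              # inlet value feeds cell 0
        source = [u / dx * cu for cu in upstream]
        c = fused_euler_step(c, dt, decay, source)
    return dx, c


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical parameters: u = 1, k = 1, L = 1, c_in = 1.
dx, c_num = simulate_pfr()
x = [(i + 1) * dx for i in range(len(c_num))]
c_exact = [math.exp(-1.0 * xi / 1.0) for xi in x]   # c_in * exp(-k x / u)
integration_mse = mse(c_num, c_exact)
print(f"outlet concentration: {c_num[-1]:.4f}, MSE: {integration_mse:.2e}")
```

Because the decay term is treated implicitly, the update stays bounded even for step sizes at which a fully explicit Euler step on the same stiff term would become unstable; the price is first-order accuracy in time. The MSE reported here also includes the spatial discretization error of the upwind scheme.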

References

[1] Cohen, B.; Beykal, B.; Bollas, G. M. Physics-Informed Genetic Programming for Discovery of Partial Differential Equations from Scarce and Noisy Data. 2023. https://doi.org/10.2139/ssrn.4604759.

[2] Hasani, R.; Lechner, M.; Amini, A.; Rus, D.; Grosu, R. Liquid Time-Constant Networks. Proc. AAAI Conf. Artif. Intell. 2021, 35 (9), 7657–7666. https://doi.org/10.1609/aaai.v35i9.16936.
