(362k) Accurate Surrogate Models for Stochastic Simulations Using Parin: Parameter As Input-Variable | AIChE

Authors 

Mohammadi, S. - Presenter, Auburn University
Cremaschi, S., Auburn University
High-fidelity simulations have become widespread in recent years as our understanding of systems and their underlying phenomena has improved and as greater computational power has become available to study, design, and optimize engineering systems (Eaton et al., 2017; Szilágyi et al., 2018). However, applications such as sensitivity analysis or optimization studies require a large number of simulation runs, and in many cases the available computational power is insufficient (Liu et al., 2016; Peherstorfer et al., 2017). Constructing surrogate models that represent high-fidelity simulations is a common way to reduce the computational cost of these simulations (Han and Zhang, 2012). Surrogate models, or meta-models, approximate the input-output relationship of a simulation model by fitting a mapping to its input and output data (Quirante et al., 2015). Many different techniques have been used successfully to construct surrogate models (e.g., Breiman, 2001; Friedman, 1991; Haleem et al., 2013; Williams and Rasmussen, 2006).

Most existing surrogate models fail to accurately represent the outputs of high-fidelity stochastic simulations, e.g., simulations with uncertain parameters (Staum, 2009). In the current literature, approaches to constructing surrogate models for stochastic simulations fall into two categories. In the first category, the uncertain parameter(s) are fixed at selected values, which are a subset of their realizations, and a surrogate model is built for each realization (Hüllen et al., 2019). In some cases, only one nominal value of the uncertain parameters is used. Because the parameter values are fixed, this approach loses uncertainty information. In the second category, stochastic kriging (SK) is used as the surrogate modeling technique (Ankenman et al., 2008). Stochastic kriging has shown promising results in predicting the expected output (Ankenman et al., 2008); however, the surrogate modeling technique is fixed, since kriging is the only technique used in SK. A comparative study of different surrogate modeling techniques has shown that the best technique depends on the characteristics of the input-output data (Williams and Cremaschi, 2019, 2021).

This work proposes a new approach, PARIN (PARameter as Input-variable), for building accurate surrogate models of stochastic simulations. PARIN proceeds in three steps. First, it converts the stochastic formulation of a simulation to a deterministic one by treating the uncertain parameter(s) as uncertain input(s) to the simulation. Second, it models the deterministic simulation outputs using any surrogate modeling technique. Third, it propagates the uncertainty of the parameters to the outputs using an efficient uncertainty propagation method (Mohammadi and Cremaschi, 2019) and the trained surrogate model, yielding an approximate distribution of the stochastic simulation output.

Figure 1 illustrates the proposed method, PARIN. Let Y = g(X; K) be a high-fidelity simulation model, where X is the input vector of dimension d1, K is the vector of uncertain system parameters with d2 components, and Y is the stochastic output. Using PARIN, the new deterministic simulation model is defined as Y' = g'(X*), where X* is a d-dimensional input vector, with d = d1 + d2, that contains the uncertain parameters of the stochastic simulation g(X; K) in addition to all simulation inputs. PARIN assumes a known distribution for the uncertain parameters K, with constant distribution parameters. Under these definitions, the stochastic simulation becomes a deterministic one, Y' = g'(X*), so any surrogate modeling technique can be used to train a model representing it: g'(X*) ≈ F'(X*), where F'(X*) is the trained surrogate model of the deterministic simulation (Figure 1).
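
The definitions above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from this work: the toy simulation `g`, the polynomial least-squares surrogate standing in for F'(X*), the normal distribution chosen for K, and the use of plain Monte Carlo sampling in place of the uncertainty propagation method of Mohammadi and Cremaschi (2019).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed distribution of the uncertain parameter K (illustrative values).
K_MEAN, K_STD = 2.0, 0.3

def g(x, k):
    """Hypothetical simulation Y = g(X; K) with d1 = 1 input and d2 = 1 parameter."""
    return k * x**2 + x

def features(x_star):
    """Polynomial features of the augmented input X* = (x, k), so d = d1 + d2 = 2."""
    x, k = x_star[:, 0], x_star[:, 1]
    return np.column_stack([np.ones_like(x), x, k, x * k, x**2, k**2, x**2 * k])

def fit_surrogate(x_star, y):
    """Fit F'(X*) by ordinary least squares; any regression technique could be swapped in."""
    coef, *_ = np.linalg.lstsq(features(x_star), y, rcond=None)
    return lambda query: features(query) @ coef

def propagate(surrogate, x, k_samples):
    """Push samples of K through the surrogate at a fixed input x (plain Monte Carlo)."""
    x_star = np.column_stack([np.full_like(k_samples, x), k_samples])
    return surrogate(x_star)

# Step 1: build a deterministic training set over X* by sampling x and k jointly.
x_train = rng.uniform(0.0, 3.0, 200)
k_train = rng.uniform(K_MEAN - 4 * K_STD, K_MEAN + 4 * K_STD, 200)
X_star = np.column_stack([x_train, k_train])

# Step 2: train the surrogate of the deterministic simulation g'(X*).
surrogate = fit_surrogate(X_star, g(x_train, k_train))

# Step 3: propagate the parameter uncertainty to the output at x = 1.5.
k_samples = rng.normal(K_MEAN, K_STD, 20000)
y_pred = propagate(surrogate, 1.5, k_samples)
print("predicted mean:", y_pred.mean(), "predicted std:", y_pred.std())
```

For this toy function the output at x = 1.5 is 2.25 K + 1.5, so the predicted mean and standard deviation should land near 6.0 and 0.675. The array `y_pred` is an empirical approximation of the output distribution at that input.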

Six different machine learning techniques are used to build the surrogate models, F'(X*). A set of test functions from the optimization test suite of the Virtual Library of Simulation Experiments (Surjanovic and Bingham, 2013) is used to compare the performance of PARIN with that of three existing approaches. In the first approach, considered the base case, the uncertain parameter(s) are fixed at their nominal values. The second approach uses a subset of fixed values for the uncertain parameter(s), and the third is stochastic kriging. Two case studies are used to examine 1) the impact of the number of input dimensions and 2) the impact of the number of uncertain parameters. The quality of the estimates is evaluated using two metrics. The first is the normalized root mean square error (nRMSE) in predicting the output value of the stochastic simulation and its associated standard deviation. The second is the Wasserstein distance between the predicted empirical distribution of the output and the true distribution from the stochastic simulation. The results show that PARIN has the lowest nRMSE in predicting the mean and standard deviation of the test functions and the smallest Wasserstein distance among the methods compared.
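
As a rough illustration of the two metrics, the sketch below computes an nRMSE and a one-dimensional Wasserstein distance between two samples. The normalization used here (RMSE divided by the range of the true values) is an assumption, since the abstract does not state which normalization was applied, and the helper names `nrmse` and `wasserstein_1d` are hypothetical. The distance uses the sorted-sample form of the first-order Wasserstein distance, which holds for two empirical distributions with the same number of samples.

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE divided by the range of the true values (assumed normalization)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return rmse / (y_true.max() - y_true.min())

def wasserstein_1d(a, b):
    """First-order Wasserstein distance between two equal-size 1-D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples, the optimal coupling pairs the sorted values.
    return np.mean(np.abs(a - b))

# Illustrative values only; these are not results from the study.
true_means = [1.0, 2.0, 3.0, 5.0]
pred_means = [1.0, 2.0, 3.0, 4.0]
print(nrmse(true_means, pred_means))

# Shifting every sample by 1 moves the distribution a distance of 1.
print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

In practice the same `nrmse` helper would be applied once to the predicted means and once to the predicted standard deviations across the test points, while `wasserstein_1d` would compare surrogate-based output samples with samples from the stochastic simulation at the same input.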

References

Ankenman, B., Nelson, B.L., Staum, J., 2008. Stochastic kriging for simulation metamodeling. Proc. - Winter Simul. Conf. 362–370. https://doi.org/10.1109/WSC.2008.4736089

Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.

Eaton, A.N., Beal, L.D.R., Thorpe, S.D., Hubbell, C.B., Hedengren, J.D., Nybø, R., Aghito, M., 2017. Real time model identification using multi-fidelity models in managed pressure drilling. Comput. Chem. Eng. 97, 76–84. https://doi.org/10.1016/j.compchemeng.2016.11.008

Friedman, J.H., 1991. Multivariate adaptive regression splines.

Haleem, K., Gan, A., Lu, J., 2013. Using multivariate adaptive regression splines (MARS) to develop crash modification factors for urban freeway interchange influence areas. Accid. Anal. Prev. 55, 12–21. https://doi.org/10.1016/j.aap.2013.02.018

Han, Z.-H., Zhang, K.-S., 2012. Surrogate-based optimization. Real-world Appl. Genet. algorithms 343–362.

Hüllen, G., Zhai, J., Kim, S.H., Sinha, A., Realff, M.J., Boukouvala, F., 2019. Managing Uncertainty in Data-Driven Simulation-Based Optimization. Comput. Chem. Eng. 106519. https://doi.org/10.1016/j.compchemeng.2019.106519

Liu, B., Koziel, S., Zhang, Q., 2016. A multi-fidelity surrogate-model-assisted evolutionary algorithm for computationally expensive optimization problems. J. Comput. Sci. 12, 28–37. https://doi.org/10.1016/j.jocs.2015.11.004

Mohammadi, S., Cremaschi, S., 2019. Efficiency of Uncertainty Propagation Methods for Estimating Output Moments, in: Muñoz, S.G., Laird, C.D., Realff, M.J. (Eds.), Proceedings of the 9th International Conference on Foundations of Computer-Aided Process Design, Computer Aided Chemical Engineering. Elsevier, pp. 487–492. https://doi.org/10.1016/B978-0-12-818597-1.50078-3

Peherstorfer, B., Kramer, B., Willcox, K., 2017. Combining multiple surrogate models to accelerate failure probability estimation with expensive high-fidelity models. J. Comput. Phys. 341, 61–75. https://doi.org/10.1016/j.jcp.2017.04.012

Quirante, N., Javaloyes, J., Ruiz-Femenia, R., Caballero, J.A., 2015. Optimization of chemical processes using surrogate models based on a kriging interpolation, in: Computer Aided Chemical Engineering. Elsevier, pp. 179–184.

Staum, J., 2009. Better Simulation Metamodeling: The why, what, and how of Stochastic Kriging 119–133.

Surjanovic, S., Bingham, D., 2013. Virtual Library of Simulation Experiments: Test Functions and Datasets.

Szilágyi, B., Agachi, P.Ş., Nagy, Z.K., 2018. Chord Length Distribution Based Modeling and Adaptive Model Predictive Control of Batch Crystallization Processes Using High Fidelity Full Population Balance Models. Ind. Eng. Chem. Res. 57, 3320–3332. https://doi.org/10.1021/acs.iecr.7b03964

Williams, B., Cremaschi, S., 2021. Selection of Surrogate Modeling Techniques for Surface Approximation and Surrogate-Based Optimization. Chem. Eng. Res. Des. https://doi.org/10.1016/j.cherd.2021.03.028

Williams, B.A., Cremaschi, S., 2019. Surrogate Model Selection for Design Space Approximation and Surrogate-based Optimization, in: Computer Aided Chemical Engineering. Elsevier, pp. 353–358.

Williams, C.K.I., Rasmussen, C.E., 2006. Gaussian processes for machine learning. MIT Press, Cambridge, MA.