(278h) From Partial Data to Out-of-Bounds Parameter and Observation Estimation with Diffusion Maps and Geometric Harmonics | AIChE

Authors 

Koronaki, E. - Presenter, University of Luxembourg
Boudouvis, A. G., National Technical University of Athens (NTUA)
Evangelou, N., Johns Hopkins University
Psarellis, G., Johns Hopkins University
Dietrich, F., Technical University of Munich
Kevrekidis, I. G., Princeton University
A data-driven framework is presented that enables the estimation of quantities, either observations or parameters, from sufficient partial data in nonlinear applications. The illustrative example is a vertical Chemical Vapor Deposition (CVD) reactor [1], commonly used in industry for the deposition of thin films on solid substrates from gas reactants. Due to competing physical mechanisms in the reactor chamber and the nonlinearity of the problem, multiple states may be observed for the same set of parameter values. The goal is to predict observations, i.e. temperature and/or velocity distributions, for new parameter values, even in the range of multiplicity.

The proposed workflow compresses the initial sample set with the Diffusion Maps algorithm [2,3] and uses special functions called Geometric Harmonics [4,5] to map between the reduced and the ambient space descriptions. The “twist” here is that the Diffusion Maps coordinates themselves are treated as functions defined on the original, or “ambient”, space and are extended with Geometric Harmonics to a larger set.

In the work presented here, observations of the dimensionless temperature and concentration along the reactor length are collected over a range of three physical parameters of the system where solution multiplicity is reported. The sample set, X ∈ R^{N×m}, contains the distribution of temperature and conversion in the reactor together with the corresponding parameters, including multiple states for the same parameter values.

Given this sample, the Diffusion Maps algorithm is used for dimensionality reduction: it identifies a parametrization, Φ, of the low-dimensional manifold that contains the data. The method is based on the construction of a Markov transition probability matrix corresponding to a random walk on a graph whose vertices are the data points, with transition probabilities given by the local similarities between pairs of data points. Carefully selected eigenvectors of the sparse Markov matrix (not necessarily the first few) are then used to generate coordinates called diffusion maps (DMAPS).
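
The construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the density normalization, the dense (rather than sparse) matrix, and the names `diffusion_maps`, `epsilon` and `n_coords` are all choices made here for illustration. The sketch simply returns the leading non-trivial eigenvectors; selecting which of them actually parametrize the manifold is left to the user.

```python
import numpy as np

def diffusion_maps(X, epsilon, n_coords=2):
    """Return eigenvalues and eigenvectors of a Markov matrix built on the rows of X.

    X        : (N, m) array of samples (assumed layout: one sample per row)
    epsilon  : Gaussian kernel scale (assumed to be supplied by the user)
    n_coords : number of non-trivial eigenvectors to return
    """
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Local similarities between pairs of data points.
    K = np.exp(-d2 / epsilon)

    # Density normalization, so that uneven sampling matters less (an assumption here).
    q = K.sum(axis=1)
    K = K / np.outer(q, q)

    # The Markov matrix is M = D^{-1} K. Its symmetric conjugate
    # S = D^{-1/2} K D^{-1/2} has the same eigenvalues and is easier to diagonalize.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)

    # Sort by decreasing eigenvalue and convert to right eigenvectors of M.
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / np.sqrt(d)[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1).
    return evals[1:n_coords + 1], psi[:, 1:n_coords + 1]
```

For data sampled along a one-dimensional curve, the first returned eigenvector would be expected to vary monotonically along the curve, while later ones are often higher harmonics of the same direction, which is why the eigenvectors have to be selected rather than taken in order.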

In order to retrieve the variables in ambient space (parameters or observations) from known diffusion coordinates, each diffusion coordinate that corresponds to a known data point is treated as the value of a function defined on the ambient space, X ∈ R^N. A functional basis on the reduced space Φ is necessary in order to interpolate the original coordinates as values of a function on Φ. This is achieved by applying Diffusion Maps to the Diffusion Maps coordinates determined during the dimensionality reduction step. This time, the first k eigenvectors are chosen, in contrast to the selection process followed during dimensionality reduction. Given this functional basis, interpolation functions are derived using the Geometric Harmonics extension. The number, k, of basis functions, i.e. diffusion coordinates, selected for interpolation is larger than the number chosen for dimensionality reduction, but it cannot exceed a limit past which numerical instabilities of the algorithm may manifest.
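
The interpolation step can be illustrated with a minimal sketch of a Geometric Harmonics extension in the sense of [4]. It rests on assumptions made only for this example: the basis is taken from the eigenvectors of a plain Gaussian kernel on the reduced coordinates (standing in for the second application of Diffusion Maps described above), and the names `fit_geometric_harmonics`, `extend`, `epsilon` and `k` are hypothetical.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)

def fit_geometric_harmonics(Y, f, epsilon, k):
    """Project f, known on the rows of Y, onto the first k kernel eigenvectors.

    Y : (N, d) reduced coordinates of the known data points
    f : (N,) or (N, p) values to interpolate (e.g. ambient-space variables)
    """
    W = np.exp(-sq_dists(Y, Y) / epsilon)
    sigma, psi = np.linalg.eigh(W)
    top = np.argsort(sigma)[::-1][:k]      # keep the k largest eigenvalues
    sigma, psi = sigma[top], psi[:, top]
    coeffs = psi.T @ f                     # expansion coefficients of f in the basis
    return {"Y": Y, "epsilon": epsilon, "sigma": sigma, "psi": psi, "coeffs": coeffs}

def extend(model, Y_new):
    """Evaluate the interpolant at new points in the reduced space."""
    W_new = np.exp(-sq_dists(Y_new, model["Y"]) / model["epsilon"])
    # Each basis function is extended by dividing by its eigenvalue. Small
    # eigenvalues amplify round-off, which is why k cannot grow without limit.
    psi_new = (W_new @ model["psi"]) / model["sigma"]
    return psi_new @ model["coeffs"]
```

Calling `model = fit_geometric_harmonics(Phi, X, epsilon, k)` followed by `extend(model, Phi_new)` would map new reduced coordinates back to ambient-space variables. The division by `sigma` in `extend` is where the numerical instability mentioned above shows up when k is too large.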

Based on the idea above, this work demonstrates that the Diffusion Maps of part of the data can be used as a functional basis in Geometric Harmonics in order to establish functional relationships between the reduced space Φ and that specific part of the data. In this case, a mapping is established between the parameters and Φ, or between the parameters together with some observations (e.g. temperature at the reactor exit) and Φ. Armed with this functional relationship, it is possible to predict the diffusion coordinates that correspond to new parameter values and ultimately (after applying the inverse map from Φ to X discussed in the previous paragraph) the states in ambient space. In linear problems, or in nonlinear problems in the absence of multiplicity, the mapping between the parameter values and Φ can deliver very accurate predictions of the system state. In the region of multiplicity, the same concept can be applied provided that additional information is given as input: the set of parameter values and one (or at most a handful) of “measurements” have to be used together in order to apply Diffusion Maps for interpolation. The term “measurement” refers to an observation, either the result of a numerical or an actual experiment, e.g. the temperature and/or conversion at the reactor exit.
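
Why a measurement is needed in the multiplicity region can be seen on a toy problem. The sketch below is purely illustrative and is not the CVD reactor model: the S-shaped branch `p = phi**3 - phi`, the measurement `T = phi + 0.5 * phi**2`, the kernel scale and the truncation levels are all hypothetical, and a plain Gaussian-kernel eigenbasis stands in for the Diffusion Maps of the partial data.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)

def kernel_extend(Y, f, Y_new, epsilon, k):
    """Extend f from the rows of Y to the rows of Y_new using k kernel eigenvectors."""
    sigma, psi = np.linalg.eigh(np.exp(-sq_dists(Y, Y) / epsilon))
    top = np.argsort(sigma)[::-1][:k]
    sigma, psi = sigma[top], psi[:, top]
    psi_new = np.exp(-sq_dists(Y_new, Y) / epsilon) @ psi / sigma
    return psi_new @ (psi.T @ f)

# Hypothetical toy branch of steady states (NOT the CVD reactor model).
phi = np.linspace(-1.5, 1.5, 200)  # stand-in for a single diffusion coordinate
p = phi**3 - phi                   # parameter: three states coexist for |p| < ~0.38
T = phi + 0.5 * phi**2             # stand-in for one scalar "measurement"

# A new state inside the multiplicity region (it is not one of the samples).
phi_true = 1.0
p_q = phi_true**3 - phi_true
T_q = phi_true + 0.5 * phi_true**2

# Partial data = parameter only: the map p -> phi is not single-valued here.
phi_from_p = kernel_extend(p[:, None], phi, np.array([[p_q]]), epsilon=0.05, k=10)[0]

# Partial data = parameter plus one measurement: the pair identifies the branch.
phi_from_pT = kernel_extend(np.column_stack([p, T]), phi,
                            np.array([[p_q, T_q]]), epsilon=0.05, k=30)[0]

print(f"true coordinate:               {phi_true:.2f}")
print(f"from parameter only:           {phi_from_p:.2f}")
print(f"from parameter + measurement:  {phi_from_pT:.2f}")
```

In this toy setting the parameter-only estimate is expected to land between the coexisting branches, whereas adding the measurement should recover a value close to the true coordinate. The predicted coordinate could then be passed to the inverse map from Φ to the ambient space.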

Overall, the proposed workflow presents a mathematically robust and accurate framework to treat (reduce) and manipulate (correlate) data, both observations and parameters, in nonlinear applications where multiple states may exist for the same parameter values. It enables the prediction of system states, in the present example distributions of conversion and temperature, given only partial and limited information, without restricting the type of information.

[1] Cheimarios N., Koronaki E.D., Boudouvis A.G. Illuminating nonlinear dependence of film deposition rate in a CVD reactor on operating conditions. Chemical Engineering Journal (2012), 181–182 516.

[2] Coifman R.R., Lafon S. Diffusion maps. Appl. Comput. Harmon. Anal. (2006), 21 5–30.

[3] Coifman R.R., Lafon S., Lee A.B., Maggioni M., Nadler B., Warner F., Zucker S.W. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Multiscale methods. Proc. Natl. Acad. Sci. USA (2005), 102 7432–7437.

[4] Coifman R.R., Lafon S. Geometric Harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Appl. Comput. Harmon. Anal. (2006), 21 31–52.

[5] Chiavazzo E., Gear C.W., Dsilva C.J., Rabin N., Kevrekidis I.G. Reduced Models in Chemical Kinetics via Nonlinear Data-Mining. Processes (2014), 2 112–140.