(416e) A Framework for the Statistical Value of Extreme Points for Function Approximation
AIChE Annual Meeting
2022 Annual Meeting
Topical Conference: Applications of Data Science to Molecules and Materials
Innovations in Methods of Data Science
Tuesday, November 15, 2022 - 4:35pm to 4:51pm
With the recent data deluge across science and engineering disciplines, there is renewed interest in data-driven techniques for function approximation. Such techniques require data that are representative of the input space, so that the true function can be approximated reasonably well. The statistical framework of experimental design allows one to estimate the relative value of data points by optimizing a user-defined objective function. Problems requiring optimal design points arise frequently in the kinetics of complex reaction systems, developmental toxicology, and thermodynamic calculations of activity coefficients for mixtures, among other applications. We argue that not all data points are equally beneficial, nor are they all necessary. We posit that some regions of the function's domain provide more information than others. Using model-based experimental design, we present the case in favor of extreme points, along with their statistical merits, and demonstrate that choosing them yields the lowest parameter uncertainty. Additionally, we formulate an optimization problem that corroborates our preference for extreme points of a function over any other data points in the input domain. The methodology can be extended to higher dimensions, and we show how an end user can apply the approach to reduce the time required for performing experiments.
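The abstract does not state which model or design criterion the authors use, so the following is only an illustrative sketch under assumptions of our own: a straight-line model y = b0 + b1*x on the domain [-1, 1], independent noise of constant variance, and the D-optimality criterion (maximizing the determinant of the information matrix X^T X). The function names `information_matrix`, `d_criterion`, and `slope_variance` are hypothetical. Under these assumptions, an exhaustive search over four-point designs on a candidate grid shows the kind of preference for extreme points that the abstract describes.

```python
import itertools
import numpy as np


def information_matrix(points):
    """Return X^T X for the assumed model y = b0 + b1*x."""
    x = np.asarray(points, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # columns: intercept, x
    return X.T @ X


def d_criterion(points):
    """D-optimality score: determinant of the information matrix."""
    return float(np.linalg.det(information_matrix(points)))


def slope_variance(points, sigma=1.0):
    """Variance of the least-squares slope estimate for noise std sigma."""
    cov = sigma**2 * np.linalg.inv(information_matrix(points))
    return float(cov[1, 1])


# Candidate locations on the assumed input domain [-1, 1].
candidates = np.linspace(-1.0, 1.0, 11)

# Score every 4-point design (replicates allowed) and keep the best one.
designs = itertools.combinations_with_replacement(candidates, 4)
best = max(designs, key=d_criterion)

# An evenly spaced design over the same domain, for comparison.
uniform = np.linspace(-1.0, 1.0, 4)

print("D-optimal design:", [float(p) for p in best])
print("slope variance, D-optimal design:", slope_variance(best))
print("slope variance, evenly spaced design:", slope_variance(uniform))
```

For this assumed linear model, the search places all four runs at the two ends of the domain, and the resulting slope variance is smaller than that of the evenly spaced design. Other models or criteria can favor additional interior points, so this sketch should not be read as the authors' formulation.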