Separation and Extraction of DNA from DNA-Protein Mixtures Microfluidic Device | AIChE

Authors 

Ladd, A. J. C. - Presenter, University of Florida
Design of experiments (sampling) and curve fitting are essential to building surrogate functions that accurately predict the output of a complex simulation or experiment with a high-dimensional input (design) space. In many cases, collecting samples is expensive and only a small amount of data can be collected, so surrogate models must be developed from as few sampled data points as possible. However, when the input space contains many variables, dense sampling with, for instance, a tensor product grid runs into the curse of dimensionality, since the number of samples needed grows exponentially with the number of dimensions. Smolyak sparse grids, in contrast, select only the points assumed to contribute most significantly to the curve fit. The numerical values of these points, as well as the basis functions that make up the surrogate function’s terms, depend on the choice of orthogonal basis functions and unidimensional grid. Improving the Smolyak algorithm further has long been of interest in the field. One proposed method uses self-adapting Smolyak sparse grids that prioritize adding samples expected to have a substantial effect on accuracy. Self-adaptive schemes use an error estimate to predict which points, selected by dimension or by region, would most reduce the error of the surrogate function. Slow exponential growth schemes, which limit the rate at which new points can be added after each iteration, have also been proposed. In this work, we describe the implementation and evaluation of a new Python package, smolyay, for surrogate modeling using Smolyak sparse grids that includes both adaptive and slow growth strategies for improved efficiency. The package has a modular, object-oriented design, allowing the choice of basis function, unidimensional grid, and optimizations to be changed and new ones to be added.
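To make the contrast between tensor product grids and Smolyak sparse grids concrete, the following is a minimal sketch that counts points in both. It does not use the smolyay package or its API. It assumes one common construction: nested Chebyshev extrema as the unidimensional grid, with one point at level 0 and 2^l + 1 points at level l. All function names (`nested_angles`, `smolyak_angles`, `to_coordinates`, `tensor_size`) and the level parameter `mu` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import product
from math import cos, pi


def nested_angles(level):
    """Exact angles t of the Chebyshev extrema x = cos(pi * t) at one level.

    Assumed growth rule: level 0 holds only the midpoint; level l >= 1
    holds 2**l + 1 points. Each level contains all points of the previous
    one, so the 1D grids are nested.
    """
    if level == 0:
        return {Fraction(1, 2)}
    n = 2 ** level
    return {Fraction(j, n) for j in range(n + 1)}


def smolyak_angles(dim, mu):
    """Union of small tensor grids whose level indices sum to at most mu."""
    points = set()
    for levels in product(range(mu + 1), repeat=dim):
        if sum(levels) <= mu:
            # Exact fractions make duplicate points collapse in the set.
            points.update(product(*(nested_angles(l) for l in levels)))
    return points


def to_coordinates(angles):
    """Map exact angles to floating-point coordinates in [-1, 1]^dim."""
    return [tuple(cos(pi * t) for t in point) for point in angles]


def tensor_size(dim, mu):
    """Number of points in the full tensor grid at the same 1D resolution."""
    return len(nested_angles(mu)) ** dim


if __name__ == "__main__":
    mu = 2
    for dim in (2, 5, 10):
        sparse = len(smolyak_angles(dim, mu))
        print(f"dim={dim:2d}  sparse={sparse:4d}  tensor={tensor_size(dim, mu)}")
```

Under these assumptions, at `mu = 2` the sparse grid has 13, 61, and 221 points in 2, 5, and 10 dimensions, while the full tensor grid at the same unidimensional resolution has 25, 3125, and 9765625 points.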
Here we compare the different strategies for building Smolyak surrogate models currently available in the smolyay package, both with one another and with other conventional surrogate modeling methods, such as Gaussian process models. The smolyay package has potential for a wide variety of applications, allowing the creation of simple models with predictive power similar to that of higher-fidelity, more detailed models. The error estimate of the self-adaptive strategies could also be informative to experimentalists who want to know which conditions will give them the most insightful results given the data they already have.
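The idea of letting an error estimate decide where to sample next can be illustrated with a deliberately simplified one-dimensional sketch. This is not the algorithm implemented in smolyay, and it is not a Smolyak grid. It assumes a piecewise-linear surrogate on [0, 1] and uses the hierarchical surplus (the gap between a new sample and the value the current surrogate predicted there) as a stand-in error indicator. The function `adaptive_sample`, its `budget` parameter, and the test function `peak` are hypothetical names introduced for this example.

```python
import heapq
from math import exp


def adaptive_sample(f, budget):
    """Greedy 1D refinement driven by a surplus-based error indicator.

    Each interval carries the surplus observed when its parent interval
    was bisected; the interval with the largest indicator is bisected next.
    Assumes budget >= 2 (the two endpoints are always sampled).
    """
    samples = {0.0: f(0.0), 1.0: f(1.0)}
    # Heap entries are (-indicator, left, right); heapq pops the smallest
    # tuple, so negating the indicator yields the largest one first.
    heap = [(-float("inf"), 0.0, 1.0)]
    while len(samples) < budget and heap:
        _, a, b = heapq.heappop(heap)
        m = 0.5 * (a + b)
        samples[m] = f(m)
        # Surplus: mismatch between the new sample and linear interpolation.
        surplus = abs(samples[m] - 0.5 * (samples[a] + samples[b]))
        heapq.heappush(heap, (-surplus, a, m))
        heapq.heappush(heap, (-surplus, m, b))
    return samples


def peak(x):
    """Illustrative response with a narrow feature near x = 0.3."""
    return exp(-200.0 * (x - 0.3) ** 2)


if __name__ == "__main__":
    samples = adaptive_sample(peak, budget=17)
    for x in sorted(samples):
        print(f"{x:.5f}  {samples[x]:.4f}")
```

In this toy setting the sampled conditions cluster on the side of the domain where the response changes quickly, which is the kind of guidance an experimentalist could draw from such an indicator. The indicator is a heuristic: a feature can be missed until a neighboring sample reveals it.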