(463f) Efficient Surrogate Model Generation with Adaptive Sequential Sampling

Authors 

Eason, J. P. - Presenter, University of Tulsa


By constructing a simpler approximation of the full system, surrogate-based optimization allows traditional derivative-based optimization techniques to be applied to more complex problems. However, constructing a surrogate model requires executing the original model many times to gather the data used to build it. Depending on the complexity of the original model, this step can become cost prohibitive. For example, if the original model is a computational fluid dynamics simulation of a packed two-phase reactor, a single simulation run might take anywhere from days to weeks to complete. It is therefore important to choose the number and location of the samples so as to minimize the upfront computational cost of generating an accurate surrogate model. In this work, three adaptive sequential sampling algorithms are developed for surrogate model construction. Their performance is compared on the basis of the number of samples needed to generate surrogate models with 5% accuracy for three challenge functions: the Shekel, Ackley, and Beale functions. Artificial neural networks are used as the surrogate models.

All three algorithms are sequential design techniques: they begin with an initial sample set, train the networks, evaluate the performance of the trained networks (K-fold cross-validation is used as the model evaluation technique), select n new data points, where n is a given fraction of the current sample size, and retrain the networks with the augmented data set. This procedure is repeated until a stopping criterion is satisfied.
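The following is a minimal sketch of this sequential design loop. The abstract does not specify the network architecture, fold count, or test function, so these are illustrative assumptions: scikit-learn's MLPRegressor as the neural-network surrogate, a 5-fold split, a toy 2-D function standing in for the expensive original model, and random proposals as a stand-in for the point-selection rules of the three algorithms (the variance-based rules are sketched after the next paragraph).

```python
# Sketch of the generic adaptive sequential sampling loop (assumptions noted above).
import numpy as np
from scipy.stats import qmc
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

def expensive_model(x):
    """Placeholder for the costly original model (e.g., a CFD simulation)."""
    return np.sum(x ** 2, axis=1) + np.sin(5.0 * x[:, 0])

def train_fold_networks(X, y, n_folds=5):
    """Train one network per fold and return the fold networks and mean CV error."""
    nets, fold_mse = [], []
    for train_idx, test_idx in KFold(n_splits=n_folds, shuffle=True).split(X):
        net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000)
        net.fit(X[train_idx], y[train_idx])
        fold_mse.append(np.mean((net.predict(X[test_idx]) - y[test_idx]) ** 2))
        nets.append(net)
    return nets, float(np.mean(fold_mse))

dim = 2
sampler = qmc.LatinHypercube(d=dim)
X = sampler.random(n=20)                    # initial sample set in [0, 1]^2
y = expensive_model(X)
frac_new = 0.2                              # n = frac_new * current sample size

for iteration in range(5):                  # a real run stops on the chosen criterion
    nets, cv_mse = train_fold_networks(X, y)
    n_new = max(1, int(frac_new * len(X)))
    X_new = sampler.random(n=n_new)         # stand-in; replace with an adaptive selection rule
    X = np.vstack([X, X_new])
    y = np.concatenate([y, expensive_model(X_new)])
    print(f"iteration {iteration}: {len(X)} samples, CV MSE = {cv_mse:.4f}")
```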

The first algorithm begins the selection of new points by generating a large Latin hypercube sample of proposed points. The set of neural networks, each trained on the data of a single fold, is used to predict the output variable at each proposed point. The variance of these predictions at a given point provides an estimate of the surrogate-model variance there. The points with the highest variance estimates are where the models' predictions are most uncertain, and hence these points are selected for evaluation with the original model. The neural networks are then retrained with the full data set. This process is repeated until the maximum variance estimate falls below a target value. The second algorithm is similar in its execution steps; however, it selects the sequential sample points using a performance metric that combines the estimated variance with the distance to the nearest-neighbor sample point, rather than the variance alone. This algorithm terminates when the performance metric is below a target value for all proposed points. The addition of the nearest-neighbor distance makes the algorithm both space-filling and adaptive, so that it can both locate and fully resolve fluctuations in the objective function. The final algorithm uses incremental Latin hypercube sampling (iLHS, as presented in Nuchitprasittichai and Cremaschi, 2012) until the mean squared error stabilizes, and then switches to the first algorithm to add samples in areas of insufficient information.
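The selection rules of the first two algorithms could be sketched as below, reusing the fold-trained networks from the previous sketch. The abstract does not state how the variance and nearest-neighbor distance are combined in the second algorithm; a simple product is used here purely as an illustrative assumption.

```python
# Sketch of the point-selection rules for Algorithms 1 and 2 (assumptions noted above).
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import qmc

def select_by_variance(nets, n_new, dim, n_proposals=1000):
    """Algorithm 1: pick the proposed points with the largest prediction variance."""
    proposals = qmc.LatinHypercube(d=dim).random(n=n_proposals)
    preds = np.column_stack([net.predict(proposals) for net in nets])
    variance = preds.var(axis=1)                 # disagreement among fold-trained networks
    best = np.argsort(variance)[-n_new:]
    return proposals[best], variance.max()

def select_by_variance_and_distance(nets, X_existing, n_new, dim, n_proposals=1000):
    """Algorithm 2: combine prediction variance with nearest-neighbor distance."""
    proposals = qmc.LatinHypercube(d=dim).random(n=n_proposals)
    preds = np.column_stack([net.predict(proposals) for net in nets])
    variance = preds.var(axis=1)
    nn_dist = cdist(proposals, X_existing).min(axis=1)   # distance to the closest existing sample
    metric = variance * nn_dist                  # assumed combination; exact form not stated
    best = np.argsort(metric)[-n_new:]
    return proposals[best], metric.max()
```

In the loop of the previous sketch, replacing the random stand-in with, for example, `X_new, max_var = select_by_variance(nets, n_new, dim)` and stopping once `max_var` drops below the target value reproduces the structure of the first algorithm; swapping in `select_by_variance_and_distance` gives the second.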

Algorithm 1 is the simplest and fastest, but it requires that the initial sample size be large enough to locate all regions of interest in the objective function. The space-filling criterion in Algorithm 2 overcomes this limitation, at the additional computational cost of calculating the nearest-neighbor distances. Algorithm 3 uses iLHS to create sufficiently space-filling samples before beginning the adaptive sampling, but the iLHS step is costly because the algorithm maintains a Latin hypercube sample at each iteration, eliminating previously sampled points if necessary. Although many past sampling algorithms were designed for a specific type of surrogate model, all three algorithms presented here generalize to any type of surrogate model. They also scale well to problems of higher dimensionality, as is common in chemical engineering applications.
