(84bi) Increasing Data Collection Efficiency through Incorporation of Derivative and Uncertainty Information into Gaussian Process Regression

Conference

AIChE Annual Meeting

Year

2023

Proceeding

Wednesday, November 8, 2023 - 3:30pm to 5:00pm

Molecular simulations generate vast amounts of information, including molecular-level details that, when appropriately averaged, provide estimates of structural, thermodynamic, and transport properties of materials and fluids. Determining the behavior of these properties as a function of state conditions or other adjustable simulation parameters, such as those related to the force field, requires many simulations across state/parameter space. While this is typically a costly endeavor, Gaussian Process Regression (GPR) techniques have recently shown promise as robust models for predicting property behavior over a given state/parameter space and for efficiently directing new data collection through active learning. We show that this automated data collection process can benefit significantly from standard uncertainty estimates, as well as from derivative information. It is best practice to include the former in any simulation results, while the latter is often neglected but can be obtained straightforwardly from statistical mechanical relations. As an example, we highlight the collection of equation of state data, where relationships between derivative information and property fluctuations are familiar. In this case, both simulations and experiments produce derivative information that can be beneficially incorporated into GPR models. However, we reveal that it is important to assign uncertainty estimates to this information, as uncertainty is expected to change drastically and systematically with derivative order (e.g., uncertainties in heat capacities may be much higher than those in average energies given a fixed amount of simulation time). We provide an extensible codebase incorporating derivative information and uncertainties into GPR models and demonstrate how this can be used to collect both simulation and experimental data more efficiently.
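
To make the idea of combining derivative observations with per-observation uncertainties concrete, the following is a minimal sketch and not the authors' codebase. It assumes a one-dimensional input, a squared-exponential (RBF) kernel with fixed hyperparameters, and a toy target sin(x) standing in for a simulated property; the function names `rbf_blocks` and `gp_predict` are hypothetical. In a real application the derivative data would come from fluctuation relations (for instance, in the canonical ensemble the derivative of the mean energy with respect to inverse temperature equals minus the energy variance), and each value and derivative would carry its own estimated variance.

```python
import numpy as np

def rbf_blocks(xa, xb, ell=1.0, sig=1.0):
    """Covariances between f and df/dx under an RBF kernel (1D inputs)."""
    r = xa[:, None] - xb[None, :]
    k = sig**2 * np.exp(-0.5 * r**2 / ell**2)   # cov(f(xa), f(xb))
    k_fd = k * r / ell**2                        # cov(f(xa), f'(xb))
    k_df = -k * r / ell**2                       # cov(f'(xa), f(xb))
    k_dd = k * (1.0 / ell**2 - r**2 / ell**4)    # cov(f'(xa), f'(xb))
    return k, k_fd, k_df, k_dd

def gp_predict(x_train, y, dy, var_y, var_dy, x_test, ell=1.0, sig=1.0):
    """Posterior mean and variance of f at x_test, given noisy values y and
    noisy derivatives dy at x_train. var_y and var_dy are per-point noise
    variances (heteroscedastic), added to the diagonal of the joint covariance."""
    k, k_fd, k_df, k_dd = rbf_blocks(x_train, x_train, ell, sig)
    K = np.block([[k, k_fd], [k_df, k_dd]])
    K = K + np.diag(np.concatenate([var_y, var_dy]))
    # Cross-covariance between f(x_test) and the stacked [values, derivatives].
    ks, ks_fd, _, _ = rbf_blocks(x_test, x_train, ell, sig)
    Ks = np.hstack([ks, ks_fd])
    obs = np.concatenate([y, dy])
    mean = Ks @ np.linalg.solve(K, obs)
    var = sig**2 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.maximum(var, 0.0)

# Toy data (assumption): f(x) = sin(x), with derivatives less certain than values.
x_train = np.array([0.0, 1.5, 3.0, 4.5, 6.0])
y = np.sin(x_train)
dy = np.cos(x_train)
var_y = np.full(x_train.size, 1e-4)
var_dy = np.full(x_train.size, 1e-2)
x_test = np.linspace(0.0, 6.0, 50)

mean, var = gp_predict(x_train, y, dy, var_y, var_dy, x_test)
```

Setting `var_dy` to a very large value effectively switches the derivative information off, which gives a simple way to compare predictive variances with and without derivatives on the same set of state points.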
Finally, we will demonstrate how active learning routines can be viewed as improved versions of common data collection algorithms, presenting, as an example, an adaptive, uncertainty-controlled version of Gibbs-Duhem integration.
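
The uncertainty-controlled data collection described above can be illustrated with a generic active learning loop. This is a hedged sketch only: it is not Gibbs-Duhem integration and not the authors' algorithm, but a plain maximum-variance acquisition rule. The names `fake_simulation` and `active_learning`, the tolerance, and the toy target sin(x) are all assumptions made for illustration. Each new "simulation" is placed where the GPR model is least certain, and collection stops once the largest predictive standard deviation over the candidate grid falls below a tolerance or a run budget is exhausted.

```python
import numpy as np

def rbf(xa, xb, ell=1.0, sig=1.0):
    r = xa[:, None] - xb[None, :]
    return sig**2 * np.exp(-0.5 * r**2 / ell**2)

def gp_posterior(x_train, y, noise_var, x_test, ell=1.0, sig=1.0):
    """GP posterior mean and variance with per-point noise variances."""
    K = rbf(x_train, x_train, ell, sig) + np.diag(noise_var)
    Ks = rbf(x_test, x_train, ell, sig)
    mean = Ks @ np.linalg.solve(K, y)
    var = sig**2 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.maximum(var, 0.0)

def fake_simulation(x, rng):
    """Hypothetical stand-in for a simulation: returns (estimate, variance)."""
    noise_var = 1e-3
    return np.sin(x) + rng.normal(0.0, np.sqrt(noise_var)), noise_var

def active_learning(x_init, candidates, tol=0.05, max_runs=30, seed=0):
    rng = np.random.default_rng(seed)
    xs, ys, vs = [], [], []

    def run(x):
        y, v = fake_simulation(x, rng)
        xs.append(x); ys.append(y); vs.append(v)

    for x in x_init:
        run(x)
    while len(xs) < max_runs:
        _, var = gp_posterior(np.array(xs), np.array(ys), np.array(vs), candidates)
        std = np.sqrt(var)
        if std.max() < tol:
            break                          # model is certain enough everywhere
        run(candidates[np.argmax(std)])    # sample where uncertainty is largest
    # Recompute the final uncertainty so it reflects all collected data.
    _, var = gp_posterior(np.array(xs), np.array(ys), np.array(vs), candidates)
    return np.array(xs), np.array(ys), float(np.sqrt(var).max())

candidates = np.linspace(0.0, 6.0, 61)
xs, ys, final_std = active_learning([0.0, 6.0], candidates)
```

The same structure would accommodate derivative observations and their uncertainties by swapping in a joint value-derivative covariance for `gp_posterior`; the stopping criterion and acquisition rule would remain unchanged.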

Poster Sessions

General Poster Session
