(743g) Machine Learning Applications for Geologic Data Integration and Operational Data Analysis in Geologic Carbon Dioxide Storage Systems

Conference

AIChE Annual Meeting

Year

2021

Proceeding

Engineering Geologic Carbon Dioxide Storage Systems

Time

Thursday, November 18, 2021 - 9:30am to 9:45am

Authors

Mishra, S. - Presenter, Battelle Memorial Institute

Hill, B., Battelle Memorial Institute

Haagsma, A., Battelle Memorial Institute

Gupta, N., Battelle

Problem:

Data-driven models built using machine learning (ML) algorithms are becoming increasingly commonplace in subsurface science and engineering applications. The impetus for adopting this emerging technology comes from its success in multiple fields such as consumer marketing, finance, design and manufacturing, health care, etc. ML is particularly well suited to characterizing, describing, and forecasting the behavior of geologic carbon dioxide storage systems, where typical data analysis challenges include: (a) incomplete data, (b) unreliable physics-based models (if they exist), and (c) data-driven models built with conventional statistical methods that are not robust.

In this presentation, we will describe the application of ML for two specific problems: (1) geologic data integration, i.e., identification and prediction of electrofacies from well-log data, and (2) operational data analysis, i.e., prediction of bottomhole pressure and temperature in CO2 injection wells from injection rate and wellhead pressure and temperature measurements.

Methods:

Our systematic ML workflow for building data-driven models involves the following steps: (a) exploratory data analysis to visually understand patterns, trends, and outliers in the multivariate datasets, (b) statistical imputation to fill in missing values (if any), (c) unsupervised learning to identify natural groupings (statistically homogeneous subsets) across the space of independent variables (predictors), and (d) supervised learning to fit predictive models between known predictors and responses (dependent variables). Unsupervised learning is typically carried out using principal component analysis (PCA), k-means clustering (kMC), hierarchical clustering (HC), etc. Supervised learning can be formulated either as a classification problem, where the response is categorical, or as a regression problem, where the response is continuous. It is typically carried out using algorithms such as k-nearest neighbors (kNN), random forest (RF), artificial neural network (ANN), etc.
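Steps (c) and (d) of this workflow can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the data are synthetic two-dimensional points, the function names (farthest_first_centers, kmeans, knn_predict) are hypothetical, and the farthest-first initialization is an assumption chosen here only to make the toy example deterministic. The sketch groups the points with k-means and then uses a kNN classifier to assign a new observation to one of the discovered groups.

```python
import math
import random
from collections import Counter


def farthest_first_centers(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, n_iter=20):
    """Lloyd's k-means algorithm; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def knn_predict(train_x, train_y, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda xy: math.dist(xy[0], query))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


# Synthetic predictors: three well-separated groups of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in blob_centers
    for _ in range(30)
]

labels = kmeans(points, k=3)                              # (c) unsupervised step
predicted = knn_predict(points, labels, query=(9.5, 0.3))  # (d) supervised step
print("group assigned to new observation:", predicted)
```

Real well-log or operational data would not be this cleanly separated, so in practice the number of groups and the classifier settings would need to be checked against held-out data.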

Results:

For the geologic data integration problem in a CO2 enhanced oil recovery project, well logs from 250+ oil wells in the Albion-Scipio field in southern Michigan were collected in a database. The database was checked for outlier values, which were either rectified or eliminated. Missing values for several well logs were then imputed using a random forest algorithm to create a complete database. Cluster analysis using both kMC and HC was used to identify 6 natural groups (or electrofacies) within the dataset. Finally, highly accurate models to predict the electrofacies from well-log attributes were built and validated using both a traditional statistical approach (logistic regression) and a machine learning approach (RF), as shown in the figure (Part A).
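As a rough illustration of the hierarchical clustering step, the sketch below implements naive agglomerative clustering in plain Python. Several things here are assumptions and not details from the study: the average-linkage criterion, the function names, and the synthetic two-attribute data arranged in six separated groups. The number of groups is passed in as an input; the sketch does not show how the six electrofacies were arrived at.

```python
import math
import random


def average_linkage(a, b, points):
    """Mean pairwise distance between the members of clusters a and b."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Naive agglomerative clustering: start with one cluster per point and
    repeatedly merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        a, b = min(
            pairs,
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], points),
        )
        clusters[a] += clusters[b]  # merge cluster b into cluster a
        del clusters[b]
    labels = [0] * len(points)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


# Synthetic stand-in for two standardized well-log attributes:
# six well-separated groups of 8 samples each.
rng = random.Random(0)
blob_centers = [(x, y) for y in (0.0, 10.0) for x in (0.0, 10.0, 20.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in blob_centers
    for _ in range(8)
]

labels = agglomerative(points, n_clusters=6)
print("group sizes:", [labels.count(g) for g in sorted(set(labels))])
```

Running k-means on the same attributes and comparing the two label sets is one simple way to check whether both methods point to the same natural groups.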

For the bottomhole pressure and temperature prediction problem, data from 3 different CO2 injection wells in different pinnacle reefs in northern Michigan were collected. The dataset includes hourly values for wellhead pressure, wellhead temperature, wellhead density, injection rate, bottomhole pressure, and bottomhole temperature. For all three wells, examination of the data revealed a bifurcation around a wellhead density of 25 lb/ft³. Therefore, separate models were built for the high- and low-density subsets of the data on either side of this threshold. As in the previous case, missing values were filled in (imputed) using a random forest regression approach. Separate predictive models were built for bottomhole pressure and temperature as a function of the surface conditions for each well. The baseline model was a multivariate linear regression model with quadratic and cross terms. Machine learning options included a kNN model, an RF model, and an ANN model. The models were validated using three replicates of randomized 80-20 split-sample testing (i.e., 80% training and 20% test data). The machine learning models were generally more successful, as demonstrated by their performance on the held-out data for one of the wells, shown in the figure (Part B).
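The density split, the baseline regression, and the validation scheme can be sketched as follows. This is a toy under stated assumptions, not the study's code: the records are synthetic and noise-free, the three predictors are assumed to be pre-scaled to the range 0 to 1, the response functions are invented, and all function names are hypothetical. Only the 25 lb/ft³ threshold, the quadratic and cross terms, and the three replicates of an 80-20 split come from the text; the kNN, RF, and ANN models are not shown.

```python
import math
import random

DENSITY_THRESHOLD = 25.0  # lb/ft^3, wellhead density split reported in the text


def quadratic_features(row):
    """Intercept, linear terms, then all squares and cross terms x_i * x_j."""
    n = len(row)
    return [1.0] + list(row) + [row[i] * row[j] for i in range(n) for j in range(i, n)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(rows, targets):
    """Least-squares coefficients via the normal equations."""
    f = [quadratic_features(r) for r in rows]
    p = len(f[0])
    ata = [[sum(fi[i] * fi[j] for fi in f) for j in range(p)] for i in range(p)]
    atb = [sum(fi[i] * t for fi, t in zip(f, targets)) for i in range(p)]
    return solve(ata, atb)


def predict(coef, row):
    return sum(c * v for c, v in zip(coef, quadratic_features(row)))


def rmse_over_replicates(rows, targets, n_rep=3, test_frac=0.2, seed=0):
    """Held-out RMSE for n_rep randomized train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rep):
        idx = list(range(len(rows)))
        rng.shuffle(idx)
        n_test = round(test_frac * len(idx))
        test, train = idx[:n_test], idx[n_test:]
        coef = fit([rows[i] for i in train], [targets[i] for i in train])
        sq = [(predict(coef, rows[i]) - targets[i]) ** 2 for i in test]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return scores


# Synthetic records: (wellhead density, scaled surface predictors, response).
rng = random.Random(1)
records = []
for _ in range(200):
    density = rng.uniform(5.0, 50.0)
    x = (rng.random(), rng.random(), rng.random())
    if density >= DENSITY_THRESHOLD:
        y = 2.0 + 1.5 * x[0] - 0.5 * x[1] + 0.8 * x[0] * x[2]
    else:
        y = 1.0 + 0.3 * x[0] + 0.7 * x[1] ** 2
    records.append((density, x, y))

# One model per density regime, each validated with three 80-20 replicates.
results = {}
for name, keep in (("high", lambda d: d >= DENSITY_THRESHOLD),
                   ("low", lambda d: d < DENSITY_THRESHOLD)):
    subset = [(x, y) for d, x, y in records if keep(d)]
    results[name] = rmse_over_replicates([x for x, _ in subset], [y for _, y in subset])
print(results)
```

Because the synthetic responses are exactly quadratic, the held-out error here is essentially zero; with field data, the same loop would be repeated for each candidate model so that their held-out errors can be compared on equal terms.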

Implications:

A systematic workflow for machine learning applications has been demonstrated for two representative subsurface problems. The identification of electrofacies helps build robust predictive models between well-log attributes and dynamic reservoir properties such as permeability. Also, bottomhole gauges are not present in all wells (and tend to malfunction from time to time), so the ability to predict downhole pressure and temperature from surface measurements is a valuable capability. Successful application of machine learning to these two different types of problems (one in geologic reservoir characterization and one in operational data analysis) demonstrates the added value of these workflows. We are currently investigating the transferability of such models to other wells in the vicinity, as well as across geologic basins.

KEYWORDS: geologic storage; carbon dioxide; machine learning; classification; regression

Topics

Carbon Capture & Storage
