(127c) Knowledge Discovery and Explanation from Industrial Process Data Using Clustering and Subspace Search

Conference

AIChE Spring Meeting and Global Congress on Process Safety

Year

2017

Proceeding

2017 Spring Meeting and 13th Global Congress on Process Safety

Group

3rd Big Data Analytics

Session

Big Data Analytics and Smart Manufacturing II

Time

Tuesday, March 28, 2017 - 4:30pm to 5:00pm

Authors

Zhu, W. - Presenter, Chemical Engineering Department, Louisiana State U

Romagnoli, J., Louisiana State University

Data-driven methods for process data analysis have received considerable attention in recent years owing to the accessibility of real-time process data. Compared with traditional model-based methods, data-driven methods are more robust across different chemical processes. Consequently, many process monitoring studies have been built on supervised learning and multivariate statistics. However, both of these approaches require well-classified historical data to describe process behaviors. A technique that can correctly extract process behaviors is therefore required before any data-driven method is implemented. To bridge this gap, we discuss a clustering-based approach to extract process behaviors from historical data. Moreover, we propose a novel approach that uses subspace search to explain the disparity between pairs of process clusters, revealing more of the knowledge hidden in historical data.

In this study, density-based spatial clustering of applications with noise (DBSCAN) and k-means clustering are introduced for extracting process behaviors from a historical database. Both clustering techniques are studied on industrial data from a pyrolysis reactor and on simulation data. Their performance is evaluated with three cluster evaluation metrics: homogeneity, completeness, and the Davies–Bouldin index.
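As a rough illustration of this step, the sketch below implements a minimal DBSCAN and the three named metrics using only the Python standard library. It is not the authors' implementation: the toy data, the `eps` and `min_pts` values, the `truth` labels, and the choice to exclude noise points from the Davies–Bouldin index are all assumptions made for the example, and k-means is omitted for brevity.

```python
import math
from collections import Counter, defaultdict

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(points)
    # eps-neighborhoods; every point counts as its own neighbor.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                frontier.extend(neighbors[j])
        cluster += 1
    return labels

def davies_bouldin(points, labels):
    """Davies-Bouldin index over non-noise clusters (lower is better)."""
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        if lab != NOISE:  # assumption: noise points are left out
            groups[lab].append(p)
    cent = {c: [sum(col) / len(ps) for col in zip(*ps)] for c, ps in groups.items()}
    scat = {c: sum(math.dist(p, cent[c]) for p in ps) / len(ps)
            for c, ps in groups.items()}
    worst = [max((scat[a] + scat[b]) / math.dist(cent[a], cent[b])
                 for b in groups if b != a) for a in groups]
    return sum(worst) / len(worst)

def entropy(a):
    n = len(a)
    return -sum(c / n * math.log(c / n) for c in Counter(a).values())

def conditional_entropy(a, b):
    """H(a | b) for two equal-length label sequences."""
    n, b_counts = len(a), Counter(b)
    return -sum(c / n * math.log(c / b_counts[y])
                for (_, y), c in Counter(zip(a, b)).items())

def homogeneity_completeness(truth, pred):
    """Entropy-based homogeneity and completeness against known labels."""
    h_t, h_p = entropy(truth), entropy(pred)
    hom = 1.0 if h_t == 0 else 1.0 - conditional_entropy(truth, pred) / h_t
    com = 1.0 if h_p == 0 else 1.0 - conditional_entropy(pred, truth) / h_p
    return hom, com

# Hypothetical toy data: two operating modes plus one isolated sample.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (50, 50)]
truth = ["mode A"] * 4 + ["mode B"] * 4 + ["fault"]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
print(davies_bouldin(points, labels))
print(homogeneity_completeness(truth, labels))
```

Homogeneity and completeness need reference labels, so they apply only where the process modes are already known (for example, in simulation data), whereas the Davies–Bouldin index is computed from the data and cluster assignments alone.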

Beyond extracting process behaviors with data clustering techniques, we propose a subspace-search-based approach that explains the disparity between a pair of process clusters in terms of the most contributing attributes. In other words, the most contributing attributes are used to explain the disparity between a given process cluster (the comparative group) and its reference cluster (the reference group). Each data sample in the comparative group is compared with the reference group by its dimensional normalized k-distance in each subspace (also called ‘dimensions’). The subspace with the highest dimensional normalized k-distance is treated as the explanation of the disparity. However, brute-force search is computationally infeasible because the number of candidate subspaces grows exponentially with the number of attributes. Sample condensation and greedy search are therefore used to keep the computation tractable in our study.
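The abstract does not define the dimensional normalized k-distance or the search procedure in detail, so the following is only a sketch under stated assumptions: the k-distance of a sample is taken as the Euclidean distance, restricted to the attributes of a subspace, to its k-th nearest neighbor in the reference group; the normalization divides by the square root of the subspace size; and the greedy search adds one attribute at a time while the score improves. All function names are hypothetical, attributes are assumed to be already scaled, and sample condensation is not shown.

```python
import math

def k_distance(sample, reference, dims, k):
    """Distance from `sample` to its k-th nearest reference point, using only `dims`."""
    dists = sorted(
        math.sqrt(sum((sample[d] - ref[d]) ** 2 for d in dims))
        for ref in reference
    )
    return dists[k - 1]

def subspace_score(comparative, reference, dims, k):
    """Mean k-distance of the comparative group, divided by sqrt(len(dims)).

    The sqrt(len(dims)) factor is an assumed normalization so that
    subspaces of different sizes can be compared.
    """
    total = sum(k_distance(s, reference, dims, k) for s in comparative)
    return total / len(comparative) / math.sqrt(len(dims))

def greedy_subspace(comparative, reference, k=2, max_dims=None):
    """Grow a subspace one attribute at a time while the score improves."""
    n_attrs = len(reference[0])
    max_dims = max_dims or n_attrs
    chosen, best = [], float("-inf")
    while len(chosen) < max_dims:
        candidates = [d for d in range(n_attrs) if d not in chosen]
        score, d = max(
            (subspace_score(comparative, reference, chosen + [d], k), d)
            for d in candidates
        )
        if score <= best:  # no attribute improves the explanation; stop
            break
        chosen.append(d)
        best = score
    return chosen, best

# Hypothetical scaled data with three attributes; only attribute 1 differs.
reference = [(0.0, 0.0, 0.0), (0.1, 0.1, -0.1), (-0.1, 0.0, 0.1), (0.0, -0.1, 0.0)]
comparative = [(0.0, 3.0, 0.0), (0.1, 3.1, 0.1)]
dims, score = greedy_subspace(comparative, reference, k=2)
print(dims, score)
```

With d attributes, a brute-force search would score all 2^d - 1 non-empty subspaces, whereas this greedy variant scores at most d(d + 1)/2 of them, at the cost of possibly missing the best subspace.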

The results illustrate that both DBSCAN and k-means clustering perform well in classifying process behaviors. Various process modes and process faults are recognized by these clustering techniques. Furthermore, the pairwise explanations of disparate process clusters appear reasonable when the variation of attributes in the explanatory subspace is reviewed. Sample condensation and greedy search reduce the computational cost, which makes the approach suitable for both online fault identification and offline data analysis.

Topics

Computing and Systems Engineering

This paper has an Extended Abstract file available; you must purchase the conference proceedings to access it.
