(305c) Extracting Meaningful Features from Industrial Text Data

Conference

AIChE Annual Meeting

Year

2023

Proceeding

2023 AIChE Annual Meeting

Group

Computing and Systems Technology Division

Session

Data science and analytics for process applications

Time

Thursday, November 9, 2023 - 8:40am to 9:00am

Authors

Castillo, I. - Presenter, The Dow Chemical Company

Strelet, E., University of Coimbra

Peng, Y., The Dow Chemical Company

Rendall, R., University of Coimbra

Chin, S. T., The Dow Chemical Company

Reis, M., University of Coimbra

In the Chemical Processing Industry (CPI), the available instrumentation may not capture all the necessary information about a process, particularly information on process health such as leaks, corrosion, insulation degradation, and unplanned events. However, text data derived from reports, alarms, process tags, etc. can serve as a diverse and informative source for process analysis and monitoring. Handled appropriately, such data can provide supplementary insights for process diagnosis, monitoring, and control.

Recent advancements in Natural Language Processing (NLP) [1] have enabled the extraction of features from text data that go beyond the frequency counting of Bag of Words (BoW) [2] approaches. NLP models can encode the meaning of text as numerical features, which can then be used for further analysis. However, NLP models remain difficult to interpret and are still used primarily as black boxes. Moreover, the power and robustness of text feature extraction methods have not yet been explored in the CPI context. Therefore, we evaluated several text feature extraction methods, including BoW and NLP-based methods, using both unsupervised and supervised approaches [3] to assess their power and robustness.
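As a point of reference for the frequency-counting baseline mentioned above, the following is a minimal sketch of a BoW encoding using only the Python standard library. It is not the authors' implementation: the helper names `tokenize` and `bag_of_words` and the three sample log entries are assumptions invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, count matrix) for a list of text documents.

    Each row of the matrix holds the number of times every vocabulary
    term occurs in the corresponding document.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical log entries, invented for illustration only.
reports = [
    "Pump seal leak detected near reactor",
    "Insulation degradation on steam line",
    "Seal replaced, no leak after restart",
]
vocabulary, features = bag_of_words(reports)
```

Each document becomes a vector of term counts, so word order and context are discarded. That loss of context is the limitation that meaning-aware NLP features aim to address.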

We applied exploratory text data analysis to a real case study from a Dow Chemical Company site to assess what information can be extracted from industrial text data to predict the probability of an event occurring. Our findings show that the context described in text data is relatively sparse, which may be related to the functional aggregation level reported in the texts. Overall, our study demonstrates the potential of text data for process analysis and monitoring in the CPI.
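The abstract does not state which model was used to estimate event probability. Purely as an illustration of how text features can feed a supervised probability estimate, the sketch below fits a small logistic regression on binary word-presence features by stochastic gradient descent. The function names, learning rate, epoch count, texts, and labels are all hypothetical and are not taken from the case study.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text, vocabulary):
    """Binary word-presence vector over a fixed vocabulary."""
    tokens = set(tokenize(text))
    return [1.0 if term in tokens else 0.0 for term in vocabulary]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, xi)))
            error = p - yi  # gradient of the log loss w.r.t. the linear score
            weights = [w - lr * error * v for w, v in zip(weights, xi)]
            bias -= lr * error
    return weights, bias


def predict_proba(text, vocabulary, weights, bias):
    """Estimated probability that the text is associated with an event."""
    x = vectorize(text, vocabulary)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Hypothetical texts and event labels (1 = event), invented for illustration.
texts = [
    "seal leak on pump",
    "leak at flange",
    "routine inspection completed",
    "routine calibration completed",
]
labels = [1, 1, 0, 0]

vocabulary = sorted({tok for text in texts for tok in tokenize(text)})
X = [vectorize(text, vocabulary) for text in texts]
weights, bias = fit_logistic(X, labels)

p_leak = predict_proba("leak near valve", vocabulary, weights, bias)
p_routine = predict_proba("routine inspection", vocabulary, weights, bias)
```

Words that never appear in the training texts (here "near" and "valve") contribute nothing to the score. With sparse industrial text this can leave many entries with little usable signal.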

References

[1] D. Antons, E. Grünwald, P. Cichy, and T. O. Salge, "The application of text mining methods in innovation research: current state, evolution patterns, and development priorities", R&D Management, vol. 50, no. 3, pp. 329–351, Jun. 2020, doi: 10.1111/radm.12408.

[2] A. Zheng and A. Casari, Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists, 1st edition. Beijing; Boston: O'Reilly Media, 2018.

[3] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edition. New York, NY: Springer, 2009.

Topics

Process Automation & Control
