(112c) A Novel Variable Selection Method for Spectrum-Based Soft Sensor Development

Conference

AIChE Spring Meeting and Global Congress on Process Safety

Year

2019

Proceeding

2019 Spring Meeting and 15th Global Congress on Process Safety

Group

Computing and Systems Technology Division

Session

Process Modeling and Simulation

Time

Tuesday, April 2, 2019 - 2:20pm to 2:45pm

Authors

He, Q. P. - Presenter, Auburn University

Lee, J., Auburn University

Wang, J., Auburn University

A Novel Variable Selection Method for Spectrum-based Soft Sensor Development

Jangwon Lee*, Jin Wang*, Q. Peter He*⁺

*Department of Chemical Engineering, Auburn University, Auburn, AL 36849, USA

(JL: jzl0164@auburn.edu; JW: wang@auburn.edu; ⁺QPH: qhe@auburn.edu)

In the last few decades, spectroscopic techniques such as
near-infrared (NIR) spectroscopy have been widely applied in the oil and gas
industry. As a result, various soft sensors have been developed to predict
a sample's properties from its spectroscopic readings. Because the readings at
different wavelengths are highly correlated, variable selection has been shown
to significantly improve a soft sensor's prediction performance and reduce
model complexity. Existing variable selection methods focus on selecting the
variables (i.e., wavelengths or wavelength segments) that are strongly
correlated with the dependent variable in order to improve prediction
performance. Although many successful applications have been reported, such
methods have limitations; for example, the selected wavelengths do not have a
clear relationship with the chemical bonds or functional groups present in the
sample. As a result, these methods can face robustness issues: their
performance can be highly sensitive to the choice of training data and can
deteriorate when they are tested on new samples.

In this work, we present a novel variable selection method
that integrates variable stability and the variable importance in the projection
(VIP) score and transforms them into a probability of variable importance. By
incorporating the random selection principle from the genetic algorithm (GA),
variables are randomly selected according to their importance probabilities.
This avoids the situation in which certain wavelengths are never selected, as
can happen when deterministic variable importance criteria are used. Using
several case studies, including gasoline and biodiesel, we compare the
performance of the proposed method with that of existing variable selection
methods, including competitive adaptive reweighted sampling (CARS), variable
importance in the projection (VIP), and genetic algorithm (GA), among others.
We show that the proposed method has several advantages: (1) significantly
better performance in all case studies; (2) identification of wavelength
segments that are clearly related to the chemical bonds and functional groups
present in the sample; (3) better robustness; and (4) fewer parameters and much
simpler tuning/training than GA.
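
The abstract does not give the formula that turns stability and VIP scores into a selection probability, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that both scores are nonnegative, that each is rescaled by its maximum, and that the two rescaled scores are averaged; the function names (importance_probabilities, sample_variables, selection_frequency) and the toy score values are hypothetical. The point it illustrates is that probabilistic selection gives a low-scoring wavelength a nonzero chance of being picked, whereas a fixed threshold would exclude it in every run.

```python
import random


def importance_probabilities(vip, stability):
    """Combine two nonnegative per-variable scores into probabilities in [0, 1].

    ASSUMPTION: each score is divided by its maximum and the two results are
    averaged. The abstract does not specify the actual transformation.
    """
    if len(vip) != len(stability):
        raise ValueError("vip and stability must have the same length")
    if min(vip) < 0 or min(stability) < 0:
        raise ValueError("scores must be nonnegative")
    max_vip, max_stab = max(vip), max(stability)
    if max_vip == 0 or max_stab == 0:
        raise ValueError("at least one score of each kind must be positive")
    return [0.5 * (v / max_vip + s / max_stab) for v, s in zip(vip, stability)]


def sample_variables(probabilities, rng):
    """Draw one random subset: variable i is kept with probability p_i."""
    return [i for i, p in enumerate(probabilities) if rng.random() < p]


def selection_frequency(probabilities, n_draws, seed=0):
    """Fraction of draws in which each variable was selected."""
    rng = random.Random(seed)
    counts = [0] * len(probabilities)
    for _ in range(n_draws):
        for i in sample_variables(probabilities, rng):
            counts[i] += 1
    return [c / n_draws for c in counts]


# Toy scores for three wavelengths (made-up numbers, for illustration only).
vip_scores = [2.0, 1.0, 0.5]
stability_scores = [4.0, 1.0, 2.0]

probs = importance_probabilities(vip_scores, stability_scores)
print(probs)  # [1.0, 0.375, 0.375]
print(selection_frequency(probs, n_draws=500, seed=1))
```

In this toy case a deterministic rule such as "keep variables with VIP score above 1" would never select the second or third wavelength, while the random draw includes each of them in a fraction of the candidate subsets.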

Topics

Computing and Systems Engineering

Sensors
