(370f) Quantitative Evaluation of Grouping Complex Substances and Sorbent Material Design: Delivering Data-Informed Insights
2019 AIChE Annual Meeting
Computing and Systems Technology Division
Interactive Session: Data and Information Systems
Tuesday, November 12, 2019 - 3:30pm to 5:00pm
In this work, we design a framework to (i) optimally group complex chemical substances based on their chemical characteristics in order to facilitate decision-making by read-across [3], and (ii) predict the sorption activity of broad-acting materials via regression techniques for different chemical groups [4]. First, we apply hierarchical clustering with Pearson correlation as the similarity metric, and build classification models with the Random Forest algorithm for optimal grouping. We use the analytical chemistry data of 60 Standard Reference Materials (SRMs) provided by the National Institute of Standards and Technology (NIST) [5], and of 15 complex chemical substances of Unknown or Variable composition, Complex reaction products, and Biological materials (UVCBs), where Gas Chromatography-Mass Spectrometry (GC-MS), two-dimensional gas chromatography with flame ionization detection (GCxGC-FID), and Ion Mobility-Mass Spectrometry (IM-MS) techniques [6] are adopted. Dimensionality reduction techniques are incorporated to select the most informative features and further improve the grouping results, which are quantified by the Fowlkes-Mallows (FM) index [7] and classification accuracy. Second, selecting the optimal sorption material for a given chemical mixture is a challenging task in which the chemical-sorbent property space must be explored iteratively to fine-tune and guide the experimental designs. Therefore, we perform predictive modeling of the sorption activity of materials via advanced regression techniques [8-9]. Our results demonstrate that modeling and data-driven optimization analysis facilitate the communication of complex substance groupings for read-across, and thus decision-making in designing solutions for the community during environmental emergency-related contamination events.
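The grouping step described above (hierarchical clustering on a Pearson-correlation similarity, scored with the Fowlkes-Mallows index) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The toy profiles, the function names, the use of 1 - r as a distance, and the choice of average linkage are all assumptions made for illustration; the abstract does not specify the linkage or the preprocessing.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster(profiles, k):
    """Agglomerative clustering into k groups.

    Assumptions (not from the abstract): distance = 1 - Pearson r,
    average linkage between clusters.
    """
    n = len(profiles)
    dist = {(i, j): 1.0 - pearson(profiles[i], profiles[j])
            for i, j in combinations(range(n), 2)}

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Merge the closest pair of clusters (a < b by construction).
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters[b]
        del clusters[b]

    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def fowlkes_mallows(true, pred):
    """FM index: TP / sqrt((TP + FP) * (TP + FN)), counted over sample pairs."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true)), 2):
        same_true = true[i] == true[j]
        same_pred = pred[i] == pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    denom = sqrt((tp + fp) * (tp + fn))
    return tp / denom if denom else 0.0


# Invented toy "analytical profiles": three rising and three falling signals.
profiles = [
    [1, 2, 3, 4, 5], [2, 4, 6, 8, 11], [0, 1, 3, 4, 4],
    [5, 4, 3, 2, 1], [9, 7, 6, 3, 1], [4, 4, 2, 1, 0],
]
known_groups = [0, 0, 0, 1, 1, 1]  # hypothetical reference grouping

labels = cluster(profiles, k=2)
score = fowlkes_mallows(known_groups, labels)
print(labels, score)
```

An FM index of 1 means the clustering reproduces the reference grouping exactly, and values closer to 0 indicate less agreement. In practice, library routines such as those in SciPy and scikit-learn would likely replace these hand-written functions.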
References
[1] TAMU Superfund Research Center (2019). https://superfund.tamu.edu/
[2] Schultz, T. W., Amcoff, P., Berggren, E., Gautier, F., Klaric, M., Knight, D. J., ... & Cronin, M. T. D. (2015). A strategy for structuring and reporting a read-across prediction of toxicity. Regulatory Toxicology and Pharmacology, 72(3), 586-601.
[3] Onel, M., Beykal, B., Ferguson, K., Chiu, W. A., McDonald, T. J., Zhou, L., House, J. S., Wright, F. A., Sheen, D. A., Rusyn, I., & Pistikopoulos, E. N. (2019). Quantitative Evaluation of Grouping Complex Substances using Analytical Chemistry Data: Delivering Data-Informed Insights. (In preparation).
[4] Onel, M., Beykal, B., Wang, M., Grimm, F. A., Zhou, L., Wright, F. A., Phillips, T. A., Rusyn, I., & Pistikopoulos, E. N. (2018). Optimal Chemical Grouping and Sorbent Material Design by Data Analysis, Modeling and Dimensionality Reduction Techniques. Computer Aided Chemical Engineering, Elsevier, 43, 421-426.
[5] de Carvalho Rocha, W. F., Schantz, M. M., Sheen, D. A., Chu, P. M., & Lippa, K. A. (2017). Unsupervised classification of petroleum Certified Reference Materials and other fuels by chemometric analysis of gas chromatography-mass spectrometry data. Fuel, 197, 248-258.
[6] Grimm, F. A., Russell, W. K., Luo, Y. S., Iwata, Y., Chiu, W. A., Roy, T., ... & Rusyn, I. (2017). Grouping of petroleum substances as example UVCBs by ion mobility-mass spectrometry to enable chemical composition-based read-across. Environmental Science & Technology, 51(12), 7197-7207.
[7] Fowlkes, E. B., & Mallows, C. L. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78(383), 553-569.
[8] Boukouvala, F., & Floudas, C. A. (2017). ARGONAUT: AlgoRithms for Global Optimization of coNstrAined grey-box compUTational problems. Optimization Letters, 11(5), 895-913.
[9] Beykal, B., Boukouvala, F., Floudas, C. A., Sorek, N., Zalavadia, H., & Gildin, E. (2018). Global optimization of grey-box computational systems using surrogate functions and application to highly constrained oil-field operations. Computers & Chemical Engineering, 114, 99-110.