(725b) Prediction of Thermal Decomposition Hazards for Pure Compounds: A Machine Learning Approach Utilizing Molecular Structure
AIChE Annual Meeting
2024
Pharmaceutical Discovery, Development and Manufacturing Forum
Predictive Scale-Up/Scale-Down for Pharmaceuticals
Thursday, October 31, 2024 - 3:48pm to 4:06pm
Despite challenges such as limited data and complex reaction mechanisms, machine learning (ML) algorithms can predict thermal hazards and safety-related properties with reasonable accuracy. Several models have been developed to predict thermal decomposition parameters for particular classes of compounds from their structural information. For instance, Fayet et al. (2011, ref 3) developed a quantitative structure-property relationship (QSPR) model that integrates Multiple Linear Regression (MLR), Partial Least Squares (PLS), and decision tree models to predict the heat of decomposition of nitroaromatic compounds. Zhang and Xu (2020, ref 4) employed a Gaussian process regression model to predict the decomposition onset temperature of lubricant additives. He et al. (2021, ref 5) combined a genetic algorithm with multiple linear regression to predict the thermal decomposition temperature of binary imidazolium ionic liquid mixtures. This literature demonstrates that predicting thermal decomposition parameters is feasible for specific compound classes. Such models help characterize the stability of structurally similar compounds and identify potential thermal hazards in the processes that use them.
The proposed methodology for predicting onset temperature from molecular structure involves collecting the data, clustering it appropriately, and developing a model from molecular descriptors. The model developed in this work uses data available in the literature. Restricting the dataset to pure compounds with at most two rings and with reactive functional groups such as amine, alcohol, nitro, and azo groups yielded approximately 1000 data points. Molecular representations in Simplified Molecular Input Line Entry System (SMILES) form were extracted, together with physicochemical and topological descriptors. Because structural changes significantly affect the onset temperature, a single generalized model may underfit and give inaccurate results. The data were therefore clustered by thermal behavior and structural similarity, and the cluster with sufficient data, namely nitro compounds, was chosen for model development. The most suitable descriptors were identified using various feature selection methods, and a multiple linear regression model was developed from them. The model is tested and validated over multiple iterations to fine-tune its configuration, ensure its robustness, and improve its predictive capability. The target accuracy is a mean absolute error (MAE) of less than 20.
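The regression and error-metric steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the two descriptors (`n_nitro`, `mol_weight`), the `toy_onset` rule, and every number are hypothetical and exist only to generate synthetic data for the example. A real workflow would compute descriptors from SMILES with a cheminformatics toolkit and select them with a feature selection method.

```python
from statistics import mean

def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.
    X is a list of descriptor rows; an intercept column is added here."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]  # [intercept, b1, b2, ...]

def predict(coef, row):
    """Evaluate the fitted linear model on one descriptor row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))

def toy_onset(row):
    """HYPOTHETICAL rule used only to create synthetic targets.
    It is not a real structure-property correlation."""
    n_nitro, mol_weight = row
    return 300.0 - 20.0 * n_nitro + 0.5 * mol_weight

# Synthetic (n_nitro, mol_weight) rows; illustrative values, not measurements.
X_train = [(1, 123.0), (2, 168.0), (1, 137.0), (3, 213.0), (2, 182.0)]
y_train = [toy_onset(r) for r in X_train]
X_test = [(1, 151.0), (2, 196.0)]
y_test = [toy_onset(r) for r in X_test]

coef = fit_mlr(X_train, y_train)
test_mae = mae(y_test, [predict(coef, r) for r in X_test])
print("coefficients:", [round(c, 3) for c in coef])
print("held-out MAE:", round(test_mae, 6))
```

Because the synthetic targets are exactly linear in the descriptors, the fit recovers the generating rule and the held-out MAE is essentially zero. With experimental onset temperatures the MAE would be nonzero, and it is this held-out value that would be compared against the accuracy target.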
This model offers insight into the thermal onset temperature at an early stage, without experimentation or computationally expensive density functional theory (DFT) simulations. Comparing the predicted onset temperature with the operating conditions indicates the criticality of the process. This enhances safety analysis by identifying hidden thermal hazards early in process development, reducing wasted resources, and improving scale-up potential.
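The comparison of predicted onset temperature with operating conditions could be expressed as a simple screening rule. The sketch below is an assumption for illustration only: the function name `screen_process`, the three category labels, the default `margin` of 100, and the choice to subtract the model error from the prediction are all hypothetical and are not taken from this work. All temperatures are assumed to share one unit.

```python
def screen_process(predicted_onset, operating_temp, model_mae=20.0, margin=100.0):
    """Classify a process by the gap between a conservatively lowered
    predicted onset temperature and the operating temperature.

    model_mae : assumed model error, subtracted to stay conservative
    margin    : HYPOTHETICAL safety margin; set it per site policy
    """
    gap = (predicted_onset - model_mae) - operating_temp
    if gap <= 0:
        return "critical"    # operating at or above the lowered onset estimate
    if gap < margin:
        return "review"      # inside the margin; experimental testing advised
    return "acceptable"      # comfortably below the lowered onset estimate

# Illustrative inputs only, not data from the study.
for onset, operating in [(250.0, 80.0), (200.0, 120.0), (130.0, 120.0)]:
    print(onset, operating, screen_process(onset, operating))
```

Subtracting the model error before comparing is one conservative design choice: a prediction that clears the margin only because of model optimism is flagged for review instead of being accepted.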
Keywords: thermal decomposition, onset temperature, molecular descriptors, predictive modeling
References
(1) T2 Laboratories Inc. Reactive Chemical Explosion | CSB. https://www.csb.gov/t2-laboratories-inc-reactive-chemical-explosion/ (accessed 2024-04-04).
(2) BP Amoco Thermal Decomposition Incident | CSB. https://www.csb.gov/bp-amoco-thermal-decomposition-incident/ (accessed 2024-04-04).
(3) Fayet, G.; Del Rio, A.; Rotureau, P.; Joubert, L.; Adamo, C. Predicting the Thermal Stability of Nitroaromatic Compounds Using Chemoinformatic Tools. Molecular Informatics 2011, 30 (6–7), 623–634. https://doi.org/10.1002/minf.201000077.
(4) Zhang, Y.; Xu, X. Machine Learning Decomposition Onset Temperature of Lubricant Additives. J. of Materi Eng and Perform 2020, 29 (10), 6605–6616. https://doi.org/10.1007/s11665-020-05146-5.
(5) He, H.; Pan, Y.; Meng, J.; Li, Y.; Zhong, J.; Duan, W.; Jiang, J. Predicting Thermal Decomposition Temperature of Binary Imidazolium Ionic Liquid Mixtures from Molecular Structures. ACS Omega 2021, 6 (20), 13116–13123. https://doi.org/10.1021/acsomega.1c00846.