(59w) Automating the Discovery of Reaction Networks for Complex Reaction Systems from Spectroscopic Measurements | AIChE

Authors 

Srinivasan, K. - Presenter, University of Alberta, Edmonton, Canada
Prasad, V., University of Alberta
The chemical complexity of distributed feedstocks such as biomass poses a roadblock to the optimization and efficient monitoring of the different states involved in their upgrading. The identification of the chemical entities and the reactions they undergo usually relies on human expertise about the system, and the reactions between the species are modelled using model compounds. The human bias inherent in this approach makes the model restrictive and does not allow on-the-fly updating of the reaction network based on new sensor measurements of the process. In this work, we present an automated, system-agnostic protocol for generating reaction network hypotheses.

This study uses Fourier transform infrared (FTIR) and proton nuclear magnetic resonance (1H-NMR) spectroscopic measurements of the hydrothermal liquefaction (HTL) of biomass. Spectroscopic sensors are frequently used in the process industries for process monitoring. A joint tensorial decomposition of the process data, based on the Beer-Lambert law, is employed to identify the spectral signatures of the individual pseudo-components (PCs). The resulting spectral signatures encode information on the underlying species in the HTL process.

The spectral signatures are deciphered using a one-dimensional convolutional neural network (CNN). The CNN performs convolution operations along the wavenumbers of the FTIR spectrum, extracting the positions of peaks along with peak shape features for use in identifying the functional groups. The identification problem is cast as a multi-label classification with a binary cross-entropy loss. In addition to classification, a spectral reconstruction task is learnt jointly by the CNN to provide a visual representation of the wavenumber regions captured by the classifier. The classifier predicts a vector of binary values, each indicating the presence or absence of a particular functional group.

We use a Bayesian structure learning approach to situate the different PCs in the reaction network graph. The structure learning problem maximizes the posterior probability of occurrence of a child node conditioned on a parent node. The identified functional groups are treated as bits of a structural fingerprint (MACCS keys), with the remaining bits set to zero. A fingerprint search of the molecules available in chemistry databases then generates candidate molecules for each PC. Reaction information available in the literature is encoded as reaction templates using atom-atom mapping. These templates provide a general rule describing the transformation of substrates into products.
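The step from classifier output to candidate molecules can be illustrated with a minimal sketch. It assumes a hypothetical list of five functional-group classes, a 0.5 decision threshold, a toy four-molecule database with illustrative bit assignments (not real MACCS keys), and Tanimoto similarity as the search metric; none of these choices is stated in the abstract.

```python
import math

# Hypothetical functional-group classes; the abstract does not list the ones used.
GROUPS = ["hydroxyl", "carbonyl", "aromatic", "ether", "carboxylic_acid"]

def bce_loss(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy over the labels of one spectrum."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def to_bits(p_pred, threshold=0.5):
    """Threshold predicted probabilities into the set of 'on' bit indices."""
    return frozenset(i for i, p in enumerate(p_pred) if p >= threshold)

def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Toy database (assumption): molecule name -> indices into GROUPS that are set.
DATABASE = {
    "furfural": frozenset({1, 2, 3}),
    "vanillin": frozenset({0, 1, 2, 3}),
    "glucose": frozenset({0, 3}),
    "acetic_acid": frozenset({0, 1, 4}),
}

def search(query_bits, database, top_k=3):
    """Rank database molecules by similarity to the query fingerprint."""
    scored = [(tanimoto(query_bits, bits), name) for name, bits in database.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]

# Hypothetical classifier output for one pseudo-component.
probabilities = [0.10, 0.90, 0.80, 0.70, 0.05]
query = to_bits(probabilities)
candidates = search(query, DATABASE)
```

Here `candidates` ranks the toy molecules by how well their bits match the predicted functional groups. A real implementation would map each group onto its corresponding MACCS key position and query a chemistry database instead of a dictionary.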
The network generation routine begins with the candidate molecule of the parent node in the learnt graph structure. Templates are applied to the candidate molecule to generate an enumerated list of products. The similarity between the products and the detected functional groups is computed to identify further candidate molecules. The same generation steps are repeated for the other candidate molecules, with additional sanity checks to ensure that reversing a reaction graph arc yields a precursor consistent with the one identified through structure learning.
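A toy version of this generation loop is sketched below. As simplifying assumptions, molecules are represented as sets of functional-group tags rather than structures, templates are hypothetical (required, removed, added) tag rules rather than atom-mapped reactions, products are scored with Tanimoto similarity, and the reversal check simply requires that undoing a template recovers the parent candidate. The actual routine may differ in all of these respects.

```python
from collections import deque

# Hypothetical templates: (name, groups required, groups removed, groups added).
TEMPLATES = [
    ("hydrolysis", {"glycosidic"}, {"glycosidic"}, {"aldehyde"}),
    ("dehydration", {"hydroxyl", "aldehyde"}, {"hydroxyl"}, {"furan"}),
    ("oxidation", {"aldehyde"}, {"aldehyde"}, {"carboxylic_acid"}),
]

def tanimoto(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def apply_forward(template, molecule):
    """Return the product tag set, or None if the template does not match."""
    _, required, removed, added = template
    if not required <= molecule:
        return None
    return frozenset((molecule - removed) | added)

def apply_reverse(template, product):
    """Undo a template to propose the precursor of a product."""
    _, _, removed, added = template
    return frozenset((product - added) | removed)

def generate_network(root, root_molecule, graph, detected, threshold=0.5):
    """Walk the learnt graph breadth-first, assigning a candidate to each node."""
    assigned = {root: frozenset(root_molecule)}
    reactions = []
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in graph.get(parent, []):
            if child in assigned:
                continue
            best = None
            for template in TEMPLATES:
                product = apply_forward(template, assigned[parent])
                if product is None:
                    continue
                # Sanity check: reversing the arc must recover the parent candidate.
                if apply_reverse(template, product) != assigned[parent]:
                    continue
                score = tanimoto(product, detected[child])
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, template[0], product)
            if best is not None:
                assigned[child] = best[2]
                reactions.append((parent, best[1], child))
                queue.append(child)
    return assigned, reactions

# Hypothetical learnt structure and detected functional groups per node.
graph = {"PC1": ["PC2"], "PC2": ["PC3"]}
detected = {
    "PC1": frozenset({"glycosidic", "hydroxyl"}),
    "PC2": frozenset({"hydroxyl", "aldehyde"}),
    "PC3": frozenset({"aldehyde", "furan"}),
}
assigned, reactions = generate_network("PC1", detected["PC1"], graph, detected)
```

Keeping only the best-scoring product per arc is a simplification made for brevity; an enumerative routine would typically retain every product above the threshold as a separate hypothesis.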

We test our methods on a synthetic dataset generated by imposing kinetics on a reaction network from the literature. The results show close agreement between the network hypothesis generated by our methods and the ground truth. The reaction network hypothesis for the HTL process indicates glucosidic structures breaking down into furfural-type molecules, along with the breakdown of polyaromatic ligninic structures into aldehydes and acids.
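A synthetic dataset of this kind could be built as in the sketch below. It assumes a hypothetical first-order series network A → B → C with arbitrary rate constants, Gaussian bands at made-up positions as the pure-component spectra, and Beer-Lambert mixing with the path length folded into the spectra. The network and kinetics actually used are not given in the abstract.

```python
import math
import random

def concentrations(t, k1=0.5, k2=0.1):
    """Analytic solution of A -> B -> C (first order, A0 = 1, k1 != k2)."""
    a = math.exp(-k1 * t)
    b = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    return [a, b, 1.0 - a - b]

def gaussian_band(x, centre, width):
    """Unit-height Gaussian absorption band."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)

# Hypothetical (centre, width) of one band per species, in cm^-1.
PURE_BANDS = [(1050.0, 30.0), (1700.0, 25.0), (1600.0, 40.0)]

def synthetic_spectra(times, wavenumbers, noise_std=0.0, seed=0):
    """Beer-Lambert mixing: each spectrum is a concentration-weighted sum
    of the pure-component spectra, plus optional Gaussian noise."""
    rng = random.Random(seed)
    pure = [[gaussian_band(x, c, w) for x in wavenumbers] for c, w in PURE_BANDS]
    data = []
    for t in times:
        conc = concentrations(t)
        row = []
        for j in range(len(wavenumbers)):
            value = sum(c_i * s[j] for c_i, s in zip(conc, pure))
            if noise_std > 0.0:
                value += rng.gauss(0.0, noise_std)
            row.append(value)
        data.append(row)
    return data

times = [0.0, 1.0, 5.0, 20.0]
wavenumbers = [1000.0 + 10.0 * i for i in range(101)]  # 1000 to 2000 cm^-1
spectra = synthetic_spectra(times, wavenumbers, noise_std=0.01)
```

Because the concentrations and pure spectra are known exactly, the signatures and network recovered from `spectra` can be compared directly against this ground truth.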