(3gz) Machine Learning for Novel Molecule IR Spectrum Prediction | AIChE

Authors 

McGill, C. J. - Presenter, North Carolina State University
Green, W., Massachusetts Institute of Technology

Research Interests

I am interested in the study and modeling of reaction systems using detailed kinetics, both in specific mechanisms and in representation of system-wide dynamics. It is my intention to study these systems computationally, paired with experimental data on the systems generated in collaboration with other research groups or published in the literature. I also intend to use machine learning tools for molecular predictions of chemical properties and reaction behavior.

The world of chemical manufacturing is filled with chemistries that are widely used, well studied, and yet not understood to the level of a detailed elementary mechanism. These reactions may already be used with great efficiency, but their full potential cannot be assessed without a fuller understanding of the elementary reactions at play. This sort of improved understanding allows a practitioner to better tune reagents and conditions. A full understanding of the mechanism could open up new pathways in analogous systems or provide solutions for unwanted byproducts. Recent computational studies have found that such commonly relied-upon chemistries as the Grignard reaction and Fischer esterification cannot be reproduced computationally in their classically taught form and require an alternative mechanism in simulation to explain observed behavior. Similar opportunities for study may exist in all manner of industrial processes, newly attainable in recent decades with modern computational tools.

Many of the most interesting reaction systems comprise very large sets of reactions. Tools such as Reaction Mechanism Generator (RMG) allow for procedural generation and study of these systems. In large systems there exist many areas for improvement which I intend to study. For instance, barrierless reactions in such systems are complex to represent in procedurally generated mechanisms and currently require significant directed attention to treat kinetically with accuracy. Barrierless reactions often form the entrance channels for subsets of the larger mechanism that behave as a group (such as pressure-dependent networks), in which the effective kinetics of the whole subset are particularly sensitive to the representation of the entrance channel. It is my intention to work on methods of calculating kinetics of barrierless reactions, with an emphasis on methods that can be performed procedurally in the generation of large mechanisms. Large mechanisms also contain subsets of reactions which are semi-equilibrated. I intend to study the dynamics of semi-equilibrated reaction sets with the goal of using system sensitivity on different time scales to simplify treatment of large mechanisms and reduce their computational load.

Machine learning is a powerful tool that has been shown to be effective at solving otherwise intractable problems in many scientific domains. Recent work using machine learning in chemistry has yielded promising results as well. Machine learning tools have been used to upgrade the results of lower accuracy quantum chemistry calculations, design artificial peptides, and even (as demonstrated by Google) predict the way a chemical will smell. In my own work, we have developed a model for the prediction of infrared absorbance spectra, using only the SMILES representation of a molecule as the input. I believe that this area of molecular property prediction continues to be ripe for development and offers continued opportunity for advancement. I intend to apply these sorts of models to prediction of properties like pKa and extend my current work on infrared spectra into other spectral domains. Further, I intend to look at ways of using machine learning tools to accelerate kinetics research through predictions of transition states and acceleration of complex kinetic calculations, such as the master equation approach for pressure-dependent networks.

Research Experience

Postdoctoral associate with William Green at the Massachusetts Institute of Technology. My research has involved both modeling system kinetics and using machine learning for molecular property prediction. The machine learning effort is part of a wide-scope DARPA-funded project for the design of novel chemical dyes. One major application has been the development of a model to predict IR spectra for novel molecules, using experimental data for model training supplemented with quantum chemical calculations for extensive model pretraining. In the course of this effort, I have coordinated the contributions of multiple team members to the project and collaborated with five partnered research groups. Apart from my research role in the group, I also served as team meeting coordinator, which entailed instituting best practices for remote work communications during the COVID-19 isolation period.

Doctoral research advised by Phillip Westmoreland at North Carolina State University. My research at NC State included experiments using precision mass spectrometry to characterize reactions, paired with quantum chemistry calculations to build detailed kinetic mechanism models. I developed reaction mechanisms for a variety of systems: biomass pyrolysis, gas-phase organophosphorus species, and organotin catalysis of esterification. The computationally focused organotin project was a successful and well-rounded effort, where I led every stage of the project. Over the course of two years I developed a plan to study an unknown mechanism for an important process, secured funding through a proposal to an industrial partner for collaborative study, surveyed mechanism options with model molecules, developed a detailed quantitative mechanism with predictions in agreement with the literature, mentored another graduate student contributing to the project, reported results to the sponsor regularly throughout the project, and wrote a manuscript for publication detailing the results.

Senior development engineer at Corning, Incorporated. I worked at the Wilmington, NC optical fiber production facility (largest optical fiber plant in the US) in the Product and Process Development Department. In this role, I operated across the different process sectors of the plant to introduce new technologies and specialty products to regular production. My time there provided me with practical expertise applying the scientific method in an application-focused environment and an abundance of experience interfacing with project stakeholders and cross-disciplinary teams.

Teaching Interests

  • Controls - In my experience in industry, it was impressed upon me that most interactions between engineers and their chemical systems are mediated by controls technology, making a functional understanding of controls necessary for applying engineering insight from other domains. I also see controls as one of the best courses available for giving students hands-on exposure to data science and the ever-present challenges of working with real data.

  • Reaction Engineering - Before taking reaction engineering, students will have studied the mathematical treatment of physical systems in transport phenomena as well as the chemistry itself. Reaction engineering and its study of kinetics with associated impacts on system design is an important bridge between these two realms of study. It is a course that I am passionate about and one that relates to both my research experience and interests.

  • Material Balances - Underlying the complex analysis of systems that takes place in all classes in chemical engineering is a core set of principles. Many of these core principles are first introduced to students in the introductory material balances course. I believe that a good conceptual foundation laid down in the material balances course will help students develop an engineering frame of mind that will benefit them in their other courses and beyond.