(219h) Characterizing Complex Solvent Environments in Acid-Catalyzed Reactions Using Molecular Dynamics Simulations and 3D Convolutional Neural Nets

Conference

AIChE Annual Meeting

Year

2020

Proceeding

2020 Virtual AIChE Annual Meeting

Group

Topical Conference: Applications of Data Science to Molecules and Materials

Session

Applications of Data Science in Catalysis and Reaction Engineering III

Time

Tuesday, November 17, 2020 - 9:15am to 9:30am

Authors

Jiang, S. - Presenter, University of Wisconsin-Madison

Chew, A. K., University of Wisconsin

Zhang, W., University of Wisconsin-Madison

Van Lehn, R., University of Wisconsin-Madison

Zavala, V. M., University of Wisconsin-Madison

The catalytic conversion of lignocellulosic biomass is a promising strategy to obtain transportation fuels and high-value chemicals from renewable feedstocks [1]. The conversion of biomass-derived molecules is typically facilitated by liquid-phase, acid-catalyzed reactions that are hindered by low reactivity in aqueous solution. One method to increase acid-catalyzed reaction rates is to modify the solvent composition by mixing organic, polar aprotic cosolvents with water to create mixed-solvent environments [2, 3]. As an alternative to trial-and-error experimentation, computational tools have been applied to understand solvent effects on chemical reactivity and to guide the design of solvent mixtures for efficient and inexpensive biomass conversion processes [4]. Molecular dynamics (MD) simulations can be used to understand and predict solvent effects on experimental reaction rates for the conversion of biomass-derived model compounds in aqueous mixtures of 1,4-dioxane (DIO), γ-valerolactone (GVL), and tetrahydrofuran (THF) [3]. We previously developed an MD model consisting of only reactant, water, and cosolvent molecules, calculated three simulation-derived descriptors as inputs to a linear regression model that predicts experimental reaction rates, and found good agreement in DIO-water mixtures [3]. The regression model was less accurate for GVL- and THF-water mixtures, indicating either that descriptors computed with classical MD cannot quantify reaction rates in these systems or that more complex descriptors must be defined to capture reactivity trends. However, designing new descriptors of reaction kinetics based on human intuition is challenging and often requires complex, time-consuming data analysis tools (e.g., solvation free energies [6] or three-dimensional solvent mapping [5]) that cannot be readily generalized across a range of solvent compositions.
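As an illustration of the descriptor-based approach described above, the following minimal Python sketch fits an ordinary least-squares model that maps three descriptors to a rate target. The descriptor values, coefficients, and function names (`fit_linear_model`, `predict_rates`) are hypothetical and the data are synthetic; the actual descriptors and fitting procedure are those of [3] and are not reproduced here.

```python
import numpy as np


def fit_linear_model(descriptors, rate_targets):
    """Ordinary least squares: rate_target ~ intercept + descriptors @ weights.

    descriptors  : array of shape (n_systems, n_descriptors)
    rate_targets : array of shape (n_systems,)
    Returns the coefficient vector [intercept, w_1, ..., w_n].
    """
    design = np.column_stack([np.ones(len(descriptors)), descriptors])
    coef, *_ = np.linalg.lstsq(design, rate_targets, rcond=None)
    return coef


def predict_rates(coef, descriptors):
    """Apply a fitted coefficient vector to new descriptor rows."""
    design = np.column_stack([np.ones(len(descriptors)), descriptors])
    return design @ coef


# Synthetic, noise-free data: 20 hypothetical systems, 3 descriptors each.
rng = np.random.default_rng(0)
synthetic_descriptors = rng.normal(size=(20, 3))
true_coef = np.array([0.5, 1.0, -2.0, 0.3])  # assumed values, illustration only
synthetic_targets = true_coef[0] + synthetic_descriptors @ true_coef[1:]

coef = fit_linear_model(synthetic_descriptors, synthetic_targets)
predictions = predict_rates(coef, synthetic_descriptors)
```

A model of this form is only as expressive as the descriptors supplied to it, which is the limitation the paragraph above points to for GVL- and THF-water mixtures.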

As an alternative to designing descriptors via human intuition, machine learning methods have been increasingly used to infer molecular properties by automatically extracting features from complex sources of data [7-13]. For example, convolutional neural networks (CNNs) can be used to identify and quantify patterns within two-dimensional (2D) spatial datasets such as images [14]. By training on a suitable set of labeled image data, CNNs learn spatial features without requiring hand-designed feature definitions and can then use these features to classify image contents. CNNs can be further generalized to extract features from three-dimensional (3D) volumetric data [15], which can facilitate the analysis of 3D molecular structures. For example, 3D CNNs have recently been used to detect protein functional sites [16], evaluate protein-ligand binding sites [17], and quantify protein-ligand binding affinities [18] by training on protein database structures. Based on these examples and our prior success using classical MD simulations to predict acid-catalyzed reaction outcomes [3], we hypothesize that 3D CNNs can exploit the output of classical MD simulations to more accurately predict solvent effects on acid-catalyzed reaction rates.
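To make the generalization from 2D to 3D convolution concrete, the sketch below applies a single 3D kernel to a small volume using plain NumPy, followed by a ReLU nonlinearity. It is a deliberately naive illustration of the operation at the core of a 3D CNN layer, not the architecture of any model cited here; the function name `conv3d_valid`, the kernel values, and the toy volume are assumptions.

```python
import numpy as np


def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a 3D volume (stride 1, no padding).

    As in most deep learning libraries, this is a cross-correlation:
    the kernel is not flipped before being applied.
    """
    kd, kh, kw = kernel.shape
    out_shape = tuple(v - k + 1 for v, k in zip(volume.shape, kernel.shape))
    out = np.zeros(out_shape)
    for i in range(out_shape[0]):
        for j in range(out_shape[1]):
            for k in range(out_shape[2]):
                patch = volume[i:i + kd, j:j + kh, k:k + kw]
                out[i, j, k] = np.sum(patch * kernel)
    return out


# Toy volume: a 5x5x5 grid with a single occupied voxel at the center.
volume = np.zeros((5, 5, 5))
volume[2, 2, 2] = 1.0

# A 3x3x3 summing kernel; in a trained CNN these weights would be learned.
kernel = np.ones((3, 3, 3))

# Convolution followed by a ReLU activation gives one feature map.
feature_map = np.maximum(conv3d_valid(volume, kernel), 0.0)
```

In practice a framework with optimized 3D convolution layers would replace the explicit loops; the loops are kept here only to show how each output voxel depends on a local 3D neighborhood.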

In this work, we developed 3D CNNs that use atomic positions obtained from classical MD simulation trajectories to predict the rates of liquid-phase, acid-catalyzed biomass conversion reactions in mixed-solvent environments. We constructed 3D grids of voxels (the 3D analogs of 2D pixels) that represent atomic positions sampled in the corresponding MD simulations. We find that our 3D CNN model, which we call SolventNet, predicts experimental reaction rates more accurately than models based on human-selected, MD-derived descriptors [3] and previously developed 3D CNNs (ORION [19] and VoxNet [20]). Surprisingly, reaction rate predictions with SolventNet require as little as 2 ns of classical MD trajectory data, roughly a 100-fold reduction from the 205 ns of MD data used in models based on human-selected descriptors [3]. This indicates that 3D atomic positions encode substantial information relevant to reactivity. We further show that SolventNet generalizes to new system compositions using leave-one-out cross-validation in which all data for a cosolvent-water mixture or reactant were treated as the test set and excluded from model training. Finally, we tested the predictive power of SolventNet for reactants in three additional polar aprotic cosolvents not included in model training: dimethyl sulfoxide, acetonitrile, and acetone. SolventNet still accurately predicts experimentally measured reaction rates in solvent mixtures containing these cosolvents despite their distinct properties (e.g., functional groups, basicity, and polarizability). To our knowledge, this work is the first to integrate 3D CNNs and classical MD simulations for the prediction of acid-catalyzed reaction rates. We envision that the computational efficiency of combining 3D CNNs with classical MD simulations will enable the integration of these tools with process models to screen solvents and optimize reactor conditions for biomass conversion processes [21].
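The two data-handling steps described above, voxelizing atomic positions and holding out an entire cosolvent or reactant during validation, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SolventNet implementation: the cubic box, the single occupancy channel, the grid resolution, and the function names (`voxelize`, `average_over_frames`, `leave_one_group_out`) are all hypothetical.

```python
import numpy as np


def voxelize(positions, box_length, n_bins):
    """Count atoms per voxel in a cubic box spanning [0, box_length] per axis.

    positions : array of shape (n_atoms, 3)
    Returns an occupancy grid of shape (n_bins, n_bins, n_bins).
    Atoms outside the box are ignored by np.histogramdd.
    """
    edges = np.linspace(0.0, box_length, n_bins + 1)
    grid, _ = np.histogramdd(positions, bins=(edges, edges, edges))
    return grid


def average_over_frames(frames, box_length, n_bins):
    """Average the occupancy grid over a sequence of trajectory frames."""
    grids = [voxelize(frame, box_length, n_bins) for frame in frames]
    return np.mean(grids, axis=0)


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one whole group.

    groups : one label per sample, e.g. the cosolvent or reactant name.
    """
    groups = np.asarray(groups)
    for held_out in np.unique(groups):
        test_mask = groups == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Toy frame: three atoms in a 4 x 4 x 4 box (arbitrary length units).
frame = np.array([[0.5, 0.5, 0.5],
                  [3.5, 3.5, 3.5],
                  [3.6, 3.4, 3.5]])
grid = voxelize(frame, box_length=4.0, n_bins=4)
mean_grid = average_over_frames([frame, frame], box_length=4.0, n_bins=4)

# Toy labels: five samples drawn from three cosolvent-water systems.
cosolvents = ["DIO", "DIO", "GVL", "THF", "THF"]
splits = list(leave_one_group_out(cosolvents))
```

Holding out every sample that shares a label, rather than one sample at a time, prevents a model from being tested on a cosolvent or reactant it has already seen during training.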

[1] L. Shuai and J. Luterbacher, ChemSusChem, 2016, 9, 133-155.

[2] M. A. Mellmer, C. Sener, J. M. R. Gallo, J. S. Luterbacher, D. M. Alonso and J. A. Dumesic, Angew Chem Int Edit, 2014, 53, 11872-11875.

[3] T. W. Walker, A. K. Chew, H. X. Li, B. Demir, Z. C. Zhang, G. W. Huber, R. C. Van Lehn and J. A. Dumesic, Energy & Environmental Science, 2018, 11, 617-628.

[4] J. J. Varghese and S. H. Mushrif, Reaction Chemistry & Engineering, 2019, 4, 165-206.

[5] S. H. Mushrif, S. Caratzoulas and D. G. Vlachos, PCCP, 2012, 14, 2637-2644.

[6] A. K. Chew and R. C. Van Lehn, Front Chem, 2019, 7, 439.

[7] C. W. Coley, W. Jin, L. Rogers, T. F. Jamison, T. S. Jaakkola, W. H. Green, R. Barzilay and K. F. Jensen, Chem Sci, 2019, 10, 370-377.

[8] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik and R. P. Adams, 2015.

[9] R. Gómez-Bombarelli, J. N. Wei, D. Duvenaud, J. M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams and A. Aspuru-Guzik, ACS Central Science, 2018, 4, 268-276.

[10] N. E. Jackson, A. S. Bowen, L. W. Antony, M. A. Webb, V. Vishwanath and J. J. de Pablo, Sci Adv, 2019, 5, eaav1190.

[11] E. Y. Lee, B. M. Fulan, G. C. L. Wong and A. L. Ferguson, Proceedings of the National Academy of Sciences, 2016, 113, 13588-13593.

[12] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing and V. Pande, Chem Sci, 2018, 9, 513-530.

[13] S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt and K.-R. Müller, Sci Adv, 2017, 3, e1603015.

[14] W. Rawat and Z. H. Wang, Neural Comput, 2017, 29, 2352-2449.

[15] R. D. Singh, A. Mittal and R. K. Bhatia, Multimedia Tools and Applications, 2019, 78, 15951-15995.

[16] W. Torng and R. B. Altman, Bioinformatics, 2018, 35, 1503-1512.

[19] N. Sedaghat, M. Zolfaghari, E. Amiri and T. Brox, arXiv preprint arXiv:1604.03351, 2016.

[20] D. Maturana and S. Scherer, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, 922-928.

[21] D. M. Alonso, S. H. Hakim, S. Zhou, W. Won, O. Hosseinaei, J. Tao, V. Garcia-Negron, A. H. Motagamwala, M. A. Mellmer, K. Huang, C. J. Houtman, N. Labbé, D. P. Harper, C. T. Maravelias, T. Runge and J. A. Dumesic, Sci Adv, 2017, 3, e1603301.

Topics

Computational Molecular Engineering

Computing and Systems Engineering
