(455c) Accelerated Discovery of High-Performance Organic Electrode Materials for Battery Applications Using an Interpretable Machine Learning Framework | AIChE

Authors 

Muthyala, M., The Ohio State University
Park, J., The Ohio State University
Zhang, S., The Ohio State University
Paulson, J., The Ohio State University
Conventional battery electrode materials rely on transition metals, which are finite resources; this reliance can diminish both the economic and environmental value of battery energy storage systems [1]. In contrast, organic electrode materials (OEMs), derived from abundant and readily available elements, offer a more sustainable energy storage solution. The extensive structural diversity of OEMs enables molecules to be tailored for specific properties, but navigating this diversity is challenging. The relationship between structure and battery properties is often unclear, and existing proxies for these properties usually require costly ab initio calculations. Moreover, OEM design is complicated by the need for molecules to meet multiple objectives and to have feasible synthetic pathways [2].

Recently, there has been a shift from traditional first-principles models toward more versatile data-driven modeling paradigms. Data-driven methods can uncover complex (nonlinear) relationships between molecular encodings and material properties. A variety of model structures, such as neural networks, random forests, and support vector machines, have been explored for molecular property modeling across different areas of chemistry and materials science, e.g., [3, 4, 5]. Although these methods reduce the need for domain expertise and trial-and-error (Edisonian) search, they typically offer limited insight into the underlying physics because of their black-box nature, and they often require large training datasets to generalize acceptably to unseen data.

To accelerate the discovery of high-performance OEMs (those with high energy density and long cycle life), we developed a new modeling framework called SPARKLE (Symbolic Predictive Algorithm for Recognizing Key Molecular Elements). We applied SPARKLE to a large class of quinone derivatives (nearly 1 million) by training interpretable predictive models for the specific energy and solvation energy in terms of a small set of learned molecular descriptors. These models were trained on density functional theory (DFT) data for ~100 quinones using a tailored symbolic regression method [6]. Each model is composed of four or fewer descriptors, all of which have an explainable structure that provides insight into which molecular features contribute most to the properties of interest. The learned models achieve high R² values on held-out test data, as well as R² > 0.7 on a set of >100,000 quinones whose properties were computed using a different DFT method [7]. Applying these models to a set of nearly 1M molecules (obtained using a molecular generator [8]), we identified many candidate OEMs with the potential to outperform the current state of the art. To filter this space more systematically, we developed a new synthetic accessibility metric that modifies the score from [9] to account for molecular symmetry. By analyzing the tradeoff between specific energy, solvation energy, and synthetic complexity, we selected a final set of 27 candidate OEMs, which were synthesized and manufactured into zinc-ion cathodes for testing in coin cell batteries. We found that 62.9% of these candidates satisfied key battery performance targets, which is 3x higher than the collective success rate achieved by the Zhang lab over the past six years (only 20.8% success across 72 tested molecules chosen by expert intuition).
Moreover, 5 of the 27 compounds were found to be stable up to 1500 cycles with a capacity retention of 70.4%, which is on par with current state-of-the-art OEMs. Because SPARKLE accounts for synthesizability, the OEMs it identifies are also likely to have simpler synthesis pathways, which is critical for achieving low manufacturing costs in practice.
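
The tradeoff between specific energy, solvation energy, and synthetic complexity can be illustrated with a Pareto (non-dominated) filter. The following is a minimal sketch, not the SPARKLE implementation: the `Candidate` type, the helper functions, and all numerical values are hypothetical and chosen only for illustration. It assumes that higher specific energy and lower synthetic complexity are preferred, and it further assumes that a higher (less negative) solvation energy is preferred as a proxy for lower solubility in the electrolyte; flip the sign in `to_costs` if the opposite convention applies.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str                # hypothetical label
    specific_energy: float   # model prediction; higher is better
    solvation_energy: float  # model prediction; ASSUMED: higher is better
    synth_complexity: float  # accessibility score; lower is better


def to_costs(c):
    """Map a candidate to a tuple in which every entry is to be minimized."""
    return (-c.specific_energy, -c.solvation_energy, c.synth_complexity)


def dominates(p, q):
    """True if cost vector p is no worse than q everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates (O(n^2) scan)."""
    costs = [to_costs(c) for c in candidates]
    return [
        c for c, p in zip(candidates, costs)
        if not any(dominates(q, p) for q in costs)
    ]


# Made-up values for illustration only; these are not data from this work.
pool = [
    Candidate("Q1", 500.0, -0.40, 2.0),
    Candidate("Q2", 450.0, -0.30, 1.5),
    Candidate("Q3", 440.0, -0.50, 2.5),  # worse than Q1 on every objective
    Candidate("Q4", 520.0, -0.60, 3.0),
]
front_names = [c.name for c in pareto_front(pool)]
print(front_names)
```

In this toy pool, Q3 is dropped because Q1 is at least as good on all three objectives; the remaining candidates each win on at least one objective. The pairwise scan is quadratic in the number of candidates, so a library of roughly one million molecules would likely call for pre-filtering or a more efficient non-dominated sorting scheme.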

References:

[1] Liang, Y., Jing, Y., Gheytani, S., Lee, K. Y., Liu, P., Facchetti, A., & Yao, Y. (2017). Universal quinone electrodes for long cycle life aqueous rechargeable batteries. Nature Materials, 16(8), 841-848.

[2] Liang, Y., & Yao, Y. (2023). Designing modern aqueous batteries. Nature Reviews Materials, 8(2), 109-122.

[3] Jin, W., Barzilay, R., & Jaakkola, T. (2018, July). Junction tree variational autoencoder for molecular graph generation. In International Conference on Machine Learning (pp. 2323-2332). PMLR.

[4] Random Forest model with combined features: A practical approach to predict liquid‐crystalline property

[5] Jorissen, R. N., & Gilson, M. K. (2005). Virtual screening of molecular databases using a support vector machine. Journal of Chemical Information and Modeling, 45(3), 549-561.

[6] Ouyang, R., Curtarolo, S., Ahmetcik, E., Scheffler, M., & Ghiringhelli, L. M. (2018). SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Physical Review Materials, 2(8), 083802.

[7] Flores, S. P., Martin-Noble, G. C., Phillips, R. L., & Schrier, J. (2015). Bio-inspired electroactive organic molecules for aqueous redox flow batteries. 1. Thiophenoquinones. The Journal of Physical Chemistry C, 119(38), 21800-21809.

[8] Berenger, F., & Tsuda, K. (2021). Molecular generation by Fast Assembly of (Deep) SMILES fragments. Journal of Cheminformatics, 13, 1-10.

[9] Coley, C. W., Rogers, L., Green, W. H., & Jensen, K. F. (2018). SCScore: synthetic complexity learned from a reaction corpus. Journal of Chemical Information and Modeling, 58(2), 252-261.