(197bc) Expanding BigSMILES for Automated Simulations and Machine Learning Representation of Polymeric Systems
2023 AIChE Annual Meeting
Computational Molecular Science and Engineering Forum
Poster Session: Computational Molecular Science and Engineering Forum
Monday, November 6, 2023 - 3:30pm to 5:00pm
With the expanded BigSMILES, the line notation alone carries enough information to generate a full polymer system from scratch: molecular weight distributions, monomer fractions, monomer reactivities and affinities, and system-level specifications such as solvents and mixtures. This enables the detailed atomistic generation of an ensemble of polymer molecules, providing a starting point for molecular dynamics simulations and the building of digital twins.
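As a rough illustration of what generating an ensemble from a compact specification can look like, the following Python sketch draws random copolymer chains from a toy description. Everything in it is an assumption made for illustration: the `SPEC` dictionary, its field names, the truncated Gaussian length distribution, and the independent monomer draws are hypothetical stand-ins, not the expanded BigSMILES syntax or the authors' generator, and the chains are coarse monomer strings rather than atomistic structures.

```python
import random

# Hypothetical ensemble specification; these fields are illustrative and
# are NOT the expanded BigSMILES syntax.
SPEC = {
    "monomers": {"A": 0.7, "B": 0.3},  # target mole fractions
    "mean_length": 50,                 # mean degree of polymerization
    "std_length": 10,                  # spread of the length distribution
}


def sample_chain(spec, rng):
    """Draw one chain: a length from a truncated Gaussian, then i.i.d. monomers."""
    length = 0
    while length < 1:  # reject non-physical (zero or negative) lengths
        length = round(rng.gauss(spec["mean_length"], spec["std_length"]))
    names = list(spec["monomers"])
    weights = list(spec["monomers"].values())
    return "".join(rng.choices(names, weights=weights, k=length))


def sample_ensemble(spec, n_chains, seed=0):
    """Return a reproducible list of chains drawn from the specification."""
    rng = random.Random(seed)
    return [sample_chain(spec, rng) for _ in range(n_chains)]


ensemble = sample_ensemble(SPEC, n_chains=1000)
# Overall fraction of monomer A across the whole ensemble.
fraction_a = sum(c.count("A") for c in ensemble) / sum(len(c) for c in ensemble)
```

A real pipeline would replace the monomer letters with molecular fragments and account for reactivity when choosing the next unit; the sketch only shows how one specification maps to many distinct molecules.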
This expanded notation can also generate initial conditions for high-throughput pipelines that analyze polymers through simulations and/or automated experimentation, starting from a single-line input that is both complete and human-readable. Moreover, because the expanded BigSMILES describes exactly one ensemble of (random) polymer molecules, it is possible to determine the probability that a given molecule belongs to the described ensemble. This provides a starting point for training autoencoders to represent these polymer ensembles for machine learning purposes, closing the loop between generating and quantifying molecules.
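To make the idea of assigning a probability to a given molecule concrete, here is a minimal Python sketch under a deliberately simple, assumed model: monomers are chosen independently with fixed probabilities, and chain length follows a geometric (Flory-type) distribution. The names `MONOMER_PROBS`, `P_CONTINUE`, and `log_probability` are hypothetical, and the model is a stand-in rather than the probability calculation used with the expanded BigSMILES.

```python
import math

# Hypothetical toy ensemble: i.i.d. monomer choice plus a geometric
# (Flory-type) chain-length distribution. NOT the expanded BigSMILES model.
MONOMER_PROBS = {"A": 0.7, "B": 0.3}
P_CONTINUE = 0.98  # probability that the chain grows by another unit


def log_probability(chain):
    """Return log P(chain) under the toy ensemble, or -inf if impossible."""
    if not chain:
        return -math.inf  # the ensemble contains no empty chains
    logp = 0.0
    for unit in chain:
        p = MONOMER_PROBS.get(unit, 0.0)
        if p == 0.0:
            return -math.inf  # monomer is not part of this ensemble
        logp += math.log(p)
    # Length term: (n - 1) continuation events, then one termination.
    logp += (len(chain) - 1) * math.log(P_CONTINUE) + math.log(1.0 - P_CONTINUE)
    return logp
```

Under these assumptions, `log_probability("AC")` is `-math.inf` because `C` is not in the ensemble, while any chain built from `A` and `B` receives a finite value. Working in log space keeps the result usable for long chains, where the raw probability would underflow.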
Overall, the expanded BigSMILES line notation provides a powerful tool for the design and analysis of polymeric systems, allowing for greater automation and efficiency in research and development.