(11e) A Flow-Matching Approach for Generative Backmapping of Biomolecules | AIChE

Authors 

Ferguson, A., University of Chicago
Backmapping is the process of recovering all-atom detail from a coarse-grained simulation. In addition to generating physically plausible structures, a robust backmapping approach should produce an ensemble of structures that reflects the diversity of the one-to-many transformation. We leverage the recently developed flow-matching paradigm to learn a vector field that transforms a prior distribution representing coarse-grained structures into a data distribution representing all-atom structures. We build our prior by mapping each atom to its nearest coarse-grained bead and sampling its position from a normal distribution centered on that bead. We then train a time-conditioned equivariant graph neural network on the linear interpolation between the prior and ground-truth structures. Finally, we use the learned vector field to integrate an ordinary differential equation that transforms out-of-sample coarse-grained structures into corresponding all-atom ensembles. We demonstrate our model on protein trajectories and on DNA-protein trajectories using a 3-site-per-nucleotide coarse-grained mapping. Our results suggest a backmapping paradigm that generalizes both across classes of biomolecules and across coarse-grained models.
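
The three steps described above (prior construction, linear interpolation, ODE integration) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the noise scale `sigma`, and the forward-Euler solver are assumptions made for illustration, since the abstract specifies none of them. The equivariant graph neural network is stood in for by an arbitrary callable `vector_field(x, t)`.

```python
import numpy as np

def nearest_bead(atom_xyz, bead_xyz):
    """Index of the closest coarse-grained bead for every atom."""
    # Pairwise atom-to-bead distances, shape (n_atoms, n_beads).
    dists = np.linalg.norm(atom_xyz[:, None, :] - bead_xyz[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def sample_prior(bead_xyz, assignment, sigma=0.1, rng=None):
    """Draw one prior sample: each atom is placed at its bead plus Gaussian noise.

    `sigma` is a hypothetical noise scale, not a value from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    centers = bead_xyz[assignment]
    return centers + sigma * rng.standard_normal(centers.shape)

def interpolate(x0, x1, t):
    """Linear path x_t = (1 - t) * x0 + t * x1 and its target velocity x1 - x0."""
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0

def integrate(vector_field, x0, n_steps=50):
    """Forward-Euler integration of dx/dt = v(x, t) from t = 0 to t = 1.

    Euler is an assumed solver choice; any ODE integrator could be substituted.
    """
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * vector_field(x, k * dt)
    return x

# Toy usage with made-up coordinates: two beads, three atoms.
beads = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
atoms = np.array([[0.5, 0.0, 0.0], [9.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
assignment = nearest_bead(atoms, beads)
x0 = sample_prior(beads, assignment, sigma=0.1, rng=np.random.default_rng(0))

# One training pair: the network would regress `target` given (x_t, t).
x_t, target = interpolate(x0, atoms, t=0.3)

# Placeholder for the trained network: the exact velocity for this single pair.
backmapped = integrate(lambda x, t: target, x0, n_steps=20)
```

In this sketch the atom-to-bead assignment is computed from a reference all-atom structure. For out-of-sample coarse-grained frames, the assignment would have to come from the mapping itself rather than from atom coordinates. Drawing several prior samples with different random seeds and integrating each one yields an ensemble of all-atom structures for the same coarse-grained frame, which is how the one-to-many character of backmapping would be expressed under these assumptions.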