(88a) A Bayesian Experimental Design Framework for Optimizing Microbial Communities
AIChE Annual Meeting
2022
Topical Conference: Applications of Data Science to Molecules and Materials
Applications of Data Science to High Throughput Experimentation
Monday, November 14, 2022 - 8:00am to 8:15am
In this work, we present a Bayesian design-of-experiments framework for modeling and optimizing microbial communities directly from data. The framework comprises three components: a recurrent neural network (RNN) architecture tailored to microbial community dynamics, a Bayesian inference method for estimating parameters and quantifying prediction uncertainty, and a model-guided optimization approach that selects batches of microbial communities that collectively maximize information content and functional design objectives.
Interactions between microbial species are complex and not yet well understood, which calls for flexible modeling approaches that learn how species interact from experimental data. Machine learning methods that model species abundance over time, such as recurrent neural networks, are therefore compelling; however, they can produce physically unrealistic predictions, such as negative species abundances or the appearance of a species that was not initially present in the community. To overcome this limitation, we present the Microbiome Recurrent Neural Network (MRNN), a modified RNN architecture that eliminates the possibility of predicting physically unrealistic species abundances and metabolite concentrations. Because biological data sets are typically limited to a small collection of noisy observations, we present a rigorous, automated approach to optimize the degree of model regularization and avoid over-fitting [8]. Furthermore, data acquisition is often time-consuming and expensive, so selecting an informative set of experiments is crucial for developing models that capture system properties while minimizing the time and resources spent on experiments [9]. We therefore adopt a Bayesian experimental design strategy to optimize dynamic biological systems using RNNs.
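The abstract does not state how the MRNN enforces its physical constraints. As a rough illustration only, the sketch below shows one way a recurrent update could guarantee non-negative abundances and keep absent species absent: the raw output is clipped at zero and multiplied by a presence mask derived from the initial condition. The function names, the tanh hidden layer, and the parameter layout are all assumptions made for this example and should not be read as the authors' architecture.

```python
import numpy as np

def init_params(n_species, n_hidden, rng, scale=0.5):
    """Random weights for a toy recurrent model (hypothetical layout)."""
    return (
        scale * rng.standard_normal((n_hidden, n_hidden)),   # W_h: hidden -> hidden
        scale * rng.standard_normal((n_hidden, n_species)),  # W_x: input -> hidden
        np.zeros(n_hidden),                                  # b_h
        scale * rng.standard_normal((n_species, n_hidden)),  # W_o: hidden -> output
        np.zeros(n_species),                                 # b_o
    )

def constrained_step(x_prev, h_prev, params, present):
    """One recurrent update with hard physical constraints (illustrative)."""
    W_h, W_x, b_h, W_o, b_o = params
    h = np.tanh(W_h @ h_prev + W_x @ x_prev + b_h)
    raw = W_o @ h + b_o
    # Clip at zero so abundances are never negative, then mask so that a
    # species absent from the initial community can never appear later.
    x_next = present * np.maximum(raw, 0.0)
    return x_next, h

def rollout(x0, params, n_steps):
    """Predict a trajectory of shape (n_steps + 1, n_species) from x0."""
    present = (x0 > 0).astype(float)
    h = np.zeros(params[0].shape[0])
    x = x0.astype(float)
    trajectory = [x]
    for _ in range(n_steps):
        x, h = constrained_step(x, h, params, present)
        trajectory.append(x)
    return np.stack(trajectory)
```

Calling `rollout(np.array([0.2, 0.0, 0.5, 0.0]), init_params(4, 8, np.random.default_rng(0)), 10)` yields a trajectory in which the second and fourth species remain exactly zero at every time point, whatever the weights are. Because the constraints are built into the update rule, they hold without any penalty term in the training loss.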
Once fit to experimental data, the MRNN makes probabilistic predictions for previously unobserved experimental conditions. A utility function that balances exploration with exploitation is then used to propose a subset of those conditions for future experiments. We show that the MRNN outperforms a more flexible machine learning model in predicting species abundances over time, using ground truth species abundance data simulated from a generalized Lotka-Volterra model. We then show that the MRNN accurately predicts confidence intervals of species abundance and metabolite concentration for an experimental dataset that contains 95 different microbial communities composed of unique subsets of 25 species. To demonstrate the ability of the framework to seek informative experimental designs that optimize a time-dependent objective, we show that the MRNN can be applied to increase the abundance of a set of beneficial microbial species using a ground truth resource-competition model.
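The abstract does not give the form of the utility function. To make the exploration-exploitation trade-off concrete, the sketch below uses a common stand-in: an upper-confidence-bound style score, computed from posterior predictive draws of the design objective, followed by top-k selection. The names `ucb_utility` and `select_batch` and the weight `kappa` are hypothetical and are not taken from the work described here.

```python
import numpy as np

def ucb_utility(samples, kappa=1.0):
    """Score each candidate condition from posterior predictive draws.

    samples : array of shape (n_draws, n_candidates) holding the predicted
              objective (e.g. abundance of target species) for each draw.
    kappa   : weight on predictive uncertainty; 0 is pure exploitation,
              larger values favor conditions the model is unsure about.
    """
    mean = samples.mean(axis=0)   # exploitation term
    std = samples.std(axis=0)     # exploration term
    return mean + kappa * std

def select_batch(samples, batch_size, kappa=1.0):
    """Return indices of the batch_size highest-scoring candidates."""
    scores = ucb_utility(samples, kappa)
    order = np.argsort(-scores, kind="stable")  # descending, ties keep index order
    return order[:batch_size]
```

With `kappa=0` the selection reduces to picking the conditions with the best predicted objective, while a larger `kappa` shifts the batch toward conditions with wide predictive spread. Scoring candidates independently ignores redundancy between batch members, so a utility that rewards information gained by the batch as a whole would need a joint criterion rather than this per-candidate ranking.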
References:
[1] Scarborough MJ, Lynch G, Dickson M, McGee M, Donohue TJ, Noguera DR. Increasing the economic value of lignocellulosic stillage through medium-chain fatty acid production. Biotechnology for biofuels. 2018;11(1):1–17.
[2] Agler MT, Spirito CM, Usack JG, Werner JJ, Angenent LT. Chain elongation with reactor microbiomes: upgrading dilute ethanol to medium-chain carboxylates. Energy & Environmental Science. 2012;5(8):8189–8192.
[3] Kaul S, Choudhary M, Gupta S, Dhar MK. Engineering host microbiome for crop improvement and sustainable agriculture. Frontiers in Microbiology. 2021;12:1125.
[4] Loffler FE, Edwards EA. Harnessing microbial activities for environmental cleanup. Current Opinion in Biotechnology. 2006;17(3):274–284.
[5] Lawson CE. Retooling Microbiome Engineering for a Sustainable Future. Msystems. 2021;6(4):e00925–21.
[6] Clark RL, Connors BM, Stevenson DM, Hromada SE, Hamilton JJ, Amador-Noguez D, et al. Design of synthetic human gut microbiome assembly and butyrate production. Nature communications. 2021;12(1):1–16.
[7] Stein RR, Tanoue T, Szabady RL, Bhattarai SK, Olle B, Norman JM, et al. Computer-guided design of optimal microbial consortia for immune system modulation. Elife. 2018;7:e30916.
[8] Bishop CM. Pattern Recognition and Machine Learning. Springer; 2006.
[9] Box GE, Lucas H. Design of experiments in non-linear situations. Biometrika. 1959;46(1/2):77–9