(468e) ML-Assisted Sequential Design of Experiments for Optimal Media Development | AIChE

Authors 

Love, J. C., Massachusetts Institute of Technology

Background

Biopharmaceuticals have attracted considerable attention because of their efficacy, lower toxicity, and fewer side effects, especially for treating complex conditions such as cancer, autoimmune diseases, and inflammatory diseases. However, these therapeutics are expensive and therefore not accessible to much of the global market. Even so, annual demand for them is growing exponentially. It is therefore timely and important to reduce the time and cost of developing and manufacturing these molecules in order to increase production and accessibility.

To this end, among several other strategies, considerable emphasis has been placed on modeling at various stages of the biopharmaceutical lifecycle, such as molecule discovery, process development, optimization, monitoring, and control. The ultimate goal is to use these models as digital surrogates of the system, so that in-silico screening can narrow down the number of experiments that must be run on the actual experimental system, substantially reducing time and resources.

Although a considerable fraction of these activities is dedicated to process development and optimization, the focus has been mainly on bioreactor and downstream operating conditions. Media development, on the other hand, has received little attention as a target for machine learning (ML) methods. Typically, a pre-determined formulation of basal and feed media is used in cell culture or cultivation, and a small design of experiments (DoE) is performed for a few additives.

Media optimization is extremely important because the medium governs the metabolic state of the cells as well as cell health and secretion capacity. We propose using machine learning to assist a sequential design of experiments and highlight its advantage: an optimal medium can be reached with a minimal number of experiments, and thus with correspondingly less time and fewer resources. As a proof of concept, we illustrate the implementation of an active learning algorithm for optimizing complex media (BMY) for cultivations of Pichia pastoris.

Methodology

The methodology uses ML to design experiments iteratively, based on the data collected in previous sets of experiments. In contrast to statistical DoEs, which propose a static design irrespective of the type of input or output, the proposed methodology suggests a much smaller set of designs in every iteration, based on the observations collected in the previous iteration. In addition, statistical DoEs are derived assuming linear or quadratic relationships between input and output, which oversimplifies most real-life systems, including cultivations and bioreactors. Finally, because statistical DoEs are static, they explore favorable and unfavorable parts of the design space equally. The proposed algorithm instead balances characterizing the design space against optimizing the target of interest [1,2]. Figure 1 provides a schematic representation of the proposed algorithmic framework.

Figure 1: Schematic representation of the ML-assisted sequential design of experiments.
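
The iterative loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the surrogate is a simple kernel-weighted average standing in for whatever ML model is actually used, `toy_titer` is a hypothetical stand-in for the wet-lab measurement, and the candidate grid, batch size, and `kappa` (the weight placed on exploration) are arbitrary choices.

```python
import math
import random

def toy_titer(x):
    """Hypothetical stand-in for a lab measurement; x = (carbon, inducer), scaled to [0, 1]."""
    carbon, inducer = x
    return math.exp(-((carbon - 0.7) ** 2 + (inducer - 0.3) ** 2) / 0.05)

def kernel(a, b, length=0.15):
    """Similarity between two designs: 1 when identical, near 0 when far apart."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * length ** 2))

def predicted_mean(x, X, y):
    """Kernel-weighted average of the outcomes observed so far (the surrogate)."""
    weights = [kernel(x, xi) for xi in X]
    total = sum(weights)
    if total < 1e-12:  # far from all data: fall back to the overall average
        return sum(y) / len(y)
    return sum(w * yi for w, yi in zip(weights, y)) / total

def novelty(x, points):
    """Crude uncertainty proxy: 0 at a tested design, close to 1 far from all of them."""
    return 1.0 - max(kernel(x, p) for p in points)

def propose_batch(candidates, X, y, batch_size, kappa=0.5):
    """Greedily pick designs that score well on predicted outcome plus novelty."""
    batch = []
    for _ in range(batch_size):
        taken = X + batch  # designs already tested or already in this batch
        untested = [c for c in candidates if c not in taken]
        best = max(untested,
                   key=lambda c: predicted_mean(c, X, y) + kappa * novelty(c, taken))
        batch.append(best)
    return batch

def run_campaign(n_rounds=6, batch_size=5, seed=0):
    """Run the sequential loop: design a batch, 'measure' it, update the data, repeat."""
    rng = random.Random(seed)
    candidates = [(i / 10, j / 10) for i in range(11) for j in range(11)]
    X = rng.sample(candidates, batch_size)  # first round: random designs
    y = [toy_titer(x) for x in X]
    for _ in range(n_rounds - 1):
        batch = propose_batch(candidates, X, y, batch_size)
        X = X + batch
        y = y + [toy_titer(x) for x in batch]
    return X, y

designs, titers = run_campaign()
best_titer, best_design = max(zip(titers, designs))
print(f"{len(designs)} of 121 candidate designs tested; "
      f"best so far {best_titer:.2f} at {best_design}")
```

With `kappa = 0` the loop only exploits the current best guess, while larger values push proposals toward untested regions. In practice, a probabilistic model such as a Gaussian process would typically supply both the prediction and its uncertainty, as in Bayesian optimization.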

Results

We have applied this strategy to media development for microbial cultivation, focusing on protein production. In particular, we use Pichia pastoris as the host organism, which has been proposed as one of the promising alternative hosts [3,4]. In contrast to CHO cultures, microbial cultivations still largely rely on complex media: BMxY (buffered minimal yeast media), where x indicates the carbon source used. Typically, glycerol is used as the carbon source in the biomass accumulation phase and methanol in the production phase (the gene encoding the protein is typically integrated under the methanol-inducible pAOX1 promoter) [5,6]. Additional co-feeds, such as sorbitol, have been explored. However, the recipe for media preparation, including the type and concentration of the carbon sources, is fixed.

As a proof of concept, we applied our algorithm to a complex media formulation, co-varying the type and amount of carbon source and the amount of inducer (methanol in this case). We applied this to two molecules with different complexity and features: the SARS-CoV-2 (COVID-19) receptor-binding domain (RBD) and human serum albumin (HSA).
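
A design space of this kind mixes a categorical variable (the type of carbon source) with continuous ones (the amounts). The sketch below shows one possible way to enumerate and encode such candidates for an ML model. The carbon sources, concentration levels, units, and the names `Medium`, `candidate_media`, and `encode` are all hypothetical placeholders, not the values or code used in the study.

```python
from dataclasses import dataclass
from itertools import product

# All values below are hypothetical placeholders, not the levels used in the study.
CARBON_SOURCES = ("glycerol", "sorbitol")
CARBON_PCT = (0.5, 1.0, 2.0, 4.0)    # carbon source amount, % (assumed units)
METHANOL_PCT = (0.5, 1.0, 2.0, 3.0)  # inducer amount, % (assumed units)

@dataclass(frozen=True)
class Medium:
    carbon_source: str
    carbon_pct: float
    methanol_pct: float

def candidate_media():
    """Enumerate every combination of carbon source type, its amount, and inducer amount."""
    return [Medium(s, c, m)
            for s, c, m in product(CARBON_SOURCES, CARBON_PCT, METHANOL_PCT)]

def encode(medium):
    """One-hot encode the categorical carbon source, then append the two numeric amounts."""
    one_hot = [1.0 if medium.carbon_source == s else 0.0 for s in CARBON_SOURCES]
    return one_hot + [medium.carbon_pct, medium.methanol_pct]

media = candidate_media()
print(len(media), encode(media[0]))
```

One-hot encoding keeps the model from assuming an ordering among carbon sources, which a single integer label would imply.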

  1. For both molecules, we observe that the optimized media increased specific productivity and titer by a factor of 2 to 3.
  2. This was achieved with as few as 70 experiments, which is 4 orders of magnitude fewer than a full screen and 3 to 4 times fewer than statistical DoEs (which are sub-optimal).
  3. The optimized media differ markedly between the two molecules, indicating that the bottlenecks in producing different molecules are different and thus call for different supplements.
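
Read literally, the experiment counts in the second point imply the rough sizes of the alternatives. The short calculation below makes that arithmetic explicit. The implied totals are back-of-the-envelope figures derived from the stated ratios, not numbers reported by the authors, and the 6-factor, 10-level grid is a hypothetical example of how quickly a full factorial screen grows.

```python
ml_experiments = 70  # reported in the results above

# "4 orders of magnitude" more than 70 experiments
implied_full_screen = ml_experiments * 10 ** 4

# "3 to 4 times" more than 70 experiments
implied_doe_range = (3 * ml_experiments, 4 * ml_experiments)

def full_factorial_runs(levels_per_factor):
    """Number of runs in a full factorial screen: the product of the level counts."""
    runs = 1
    for levels in levels_per_factor:
        runs *= levels
    return runs

print(implied_full_screen)            # 700000
print(implied_doe_range)              # (210, 280)
# Hypothetical grid: 6 factors at 10 levels each already needs 10**6 runs.
print(full_factorial_runs([10] * 6))  # 1000000
```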

Next Steps

  1. Currently, we are applying the strategy to optimize media for monoclonal antibody and nanobody production.
  2. As a proof of concept, we have so far applied the algorithm only to the protein production phase and not to the biomass accumulation phase. Extending it to the latter is likely to provide an additional boost in productivity.
  3. Finally, we aim to apply this strategy to the design of chemically defined media for microbial cultivations.
  4. In parallel, we are using RNA sequencing to interpret which changes in the host, relative to the benchmark conditions, have led to the favorable outcomes.

References

  1. Narayanan, H. et al. Design of Biopharmaceutical Formulations Accelerated by Machine Learning. Molecular Pharmaceutics 18, 3843–3853 (2021).
  2. Bader, J., Narayanan, H., Arosio, P. & Leroux, J.-C. Improving extracellular vesicles production through a Bayesian optimization-based experimental design. European Journal of Pharmaceutics and Biopharmaceutics 182, 103–114 (2023).
  3. Love, K. R., Dalvie, N. C. & Love, J. C. The yeast stands alone: the future of protein biologic production. Current Opinion in Biotechnology 53, 50–58 (2018).
  4. Brady, J. R. & Love, J. C. Alternative hosts as the missing link for equitable therapeutic protein production. Nature Biotechnology 39, 404–407 (2021).
  5. EasySelect™ Pichia.
  6. Julien, C. Production of Humanlike Recombinant Proteins in Pichia pastoris. (2006).