(39g) Developing Digital Twins for Pharmaceutical Crystallization Processes Using Machine Learning-Based Strategies
AIChE Annual Meeting
2024
Pharmaceutical Discovery, Development and Manufacturing Forum
Data-driven approaches, and ML/AI for pharmaceutical applications
Sunday, October 27, 2024 - 5:36pm to 5:57pm
Although a growing body of literature reports proof-of-concept studies that apply data-driven strategies to the modeling and control of crystallization processes, a significant research gap persists, particularly for pharmaceutical APIs. The gap is characterized by a limited connection to experiments: few studies either train models on experimental data or validate them against experimental results [2]. The primary objective of this research is therefore to address this gap by devising systematic frameworks for constructing data-driven digital twins from experimental data and assessing their effectiveness for the robust design and control of pharmaceutical crystallization processes.
Framework development began with a model compound from the literature that has a needle-shaped morphology and is described by a two-dimensional population balance model (PBM) [3]. The data-driven digital twin was developed in phases that mirror a typical industrial API development procedure: small-scale Crystalline experiments, lab-scale crystallization experiments, and pilot-scale experiments. For each phase, the training data for the data-driven models was constructed from the experimental information available during that phase. For example, only the operating conditions and crystal quality attributes (yield and crystal size distribution (CSD) in both the length and width directions) were used in the first phase, while the second phase additionally included calibrated process analytical technology (PAT) data from lab-scale experiments.

The machine learning development workflow included data pre-processing, evaluation of different model architectures, hyperparameter optimization, and model testing. To address the training-data scarcity commonly encountered when developing data-driven digital twins from experimental data, state-of-the-art deep learning-based generative modeling techniques were evaluated and used to generate synthetic experimental data from real data. Furthermore, augmentation of the real and synthetic data was used to enhance the training of the neural networks and thereby improve their prediction capabilities. Lastly, the developed data-driven digital twin framework was applied to, and validated with, experimental data from a real-life API crystallization process.
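The idea of enlarging a scarce experimental dataset with synthetic samples can be illustrated with a minimal sketch. It does not reproduce the deep learning-based generative models evaluated in this work: as a deliberately simple stand-in, it resamples real experiments and adds small Gaussian jitter. The function names (`make_synthetic`, `augment`), the column names, the `noise_frac` parameter, and the numerical values are all hypothetical placeholders, not data or settings from the study.

```python
import random
from statistics import stdev


def make_synthetic(real_rows, n_new, noise_frac=0.05, seed=0):
    """Stand-in for a generative model (assumption, not the authors' method):
    resample real rows and add Gaussian jitter scaled by each column's
    sample standard deviation. Requires at least two real rows."""
    rng = random.Random(seed)
    columns = list(real_rows[0].keys())
    spread = {c: stdev([r[c] for r in real_rows]) for c in columns}
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(real_rows)
        synthetic.append(
            {c: base[c] + rng.gauss(0.0, noise_frac * spread[c]) for c in columns}
        )
    return synthetic


def augment(real_rows, n_new, **kwargs):
    """Return the real rows followed by n_new synthetic rows,
    each tagged with its origin so the two can be separated later."""
    tagged_real = [dict(r, source="real") for r in real_rows]
    tagged_synthetic = [
        dict(r, source="synthetic")
        for r in make_synthetic(real_rows, n_new, **kwargs)
    ]
    return tagged_real + tagged_synthetic


if __name__ == "__main__":
    # Placeholder values for illustration only, not measured data.
    experiments = [
        {"cooling_rate": 0.1, "seed_loading": 1.0, "yield": 80.0},
        {"cooling_rate": 0.3, "seed_loading": 2.0, "yield": 85.0},
        {"cooling_rate": 0.5, "seed_loading": 3.0, "yield": 90.0},
    ]
    training_set = augment(experiments, n_new=5)
    for row in training_set:
        print(row)
```

Tagging each row with its `source` is a design choice of this sketch: it makes it straightforward to train on the combined set while reserving only real experiments for model testing.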
Acknowledgement:
Funding from Takeda Pharmaceuticals International Co. is gratefully acknowledged.
References:
- Nagy, Z. K. & Braatz, R. D. Advances and new directions in crystallization control. Annu. Rev. Chem. Biomol. Eng. 3, 55–75 (2012).
- Xiouras, C., Cameli, F., Quilló, G. L., Kavousanakis, M. E., Vlachos, D. G. & Stefanidis, G. D. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem. Rev. 122, 13006–13042 (2022).
- Wu, W. L., Chappelow, C., Hanspal, N., Larsen, P. A., Patton, J., Shinkle, A. & Nagy, Z. K. Digital Design of an Agrochemical Crystallization Process via Two-Dimensional Population Balance Modeling. Org. Process Res. Dev. 28, 558 (2023).