(184u) Machine Learning and Artificial Intelligence for Formulation Optimization | AIChE

Formulation of oral solids is challenging, in part because powder properties alone are difficult to deal with. The number of materials available for use, including variations in their manufacture and combination, further complicates the task. When formulating a new molecular entity, the most important ingredient is the least well characterized. Additional considerations such as stability, release rate, size, shape, swallowability, identifying markings, or even the ability to split the tablet add to the amount of information to be processed.

Modern techniques such as statistically designed experiments, material property characterization, and advanced analytical technology make the data more meaningful, but the size of the dataset and the number of factors to consider only grow. Even after a lifetime of experience, a formulation team often falls back on its analog neural nets of what worked before, its command of the graphs it has absorbed from the literature, and the limited experimentation it has resources for.

Recent advances in computing power have fueled new capabilities in machine learning (ML) and artificial intelligence (AI). The digital neural net can process more factors in minutes than the analog net is capable of after years of training. New insights from the ML/AI approach provide a faster path to selection and optimization, and better use of resources in implementing formulation designs.

This work describes a collaboration that combines prior knowledge of formulation design with SuntheticsML, a user-friendly ML platform, to learn from a training set of data from a matrix controlled-release oral solid dosage form. SuntheticsML is an accessible online ML tool tailored for researchers without coding or ML expertise. The platform was intuitive, requiring only an introduction to the major features and some practical use to learn. Although specialized training would be valuable in understanding all the features available, someone experienced in the basics of formulation and simple statistical designs or models could derive value on essentially a walk-up basis.

SuntheticsML provides researchers with versatility across applications in formulation optimization, proficiently handling numeric, discrete, and mixed-integer optimization problems with up to 20 input parameters. This ML-powered approach allows flexible execution in serial or parallel experimentation, accommodating any user-specified budget. Furthermore, it facilitates bounded-target, multi-objective, and constrained-input optimizations, enabling simultaneous enhancements in cost and material efficiency.
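To make the terms bounded-target, constrained-input, and mixed (discrete plus numeric) optimization concrete, here is a minimal sketch in Python. It is not the SuntheticsML algorithm, which the abstract does not describe. The grade names, the stand-in release model, the cost factors, and every number are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy inputs: none of these names or numbers come from the study.
GRADE_SLOWING = {"low_viscosity": 1.0, "high_viscosity": 1.6}  # release-slowing factor
GRADE_COST = {"low_viscosity": 1.0, "high_viscosity": 1.4}     # relative cost per % polymer


def predicted_release(grade, polymer_pct):
    """Stand-in model (an assumption): percent drug released at a fixed time point."""
    return 100.0 - GRADE_SLOWING[grade] * polymer_pct * 1.5


def target_miss(value, low, high):
    """Bounded-target penalty: zero inside [low, high], distance to the window outside."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def search(low, high, max_polymer_pct):
    """Enumerate a discrete grade and a numeric polymer level under an input constraint."""
    candidates = []
    for grade, polymer_pct in product(GRADE_SLOWING, range(10, 45, 5)):
        if polymer_pct > max_polymer_pct:  # constrained input
            continue
        release = predicted_release(grade, polymer_pct)
        miss = target_miss(release, low, high)
        cost = GRADE_COST[grade] * polymer_pct
        candidates.append((miss, cost, grade, polymer_pct, release))
    # Two objectives handled lexicographically: hit the window first, then minimize cost.
    return min(candidates)


best = search(low=50.0, high=60.0, max_polymer_pct=40)
print(best)
```

In this toy setup both grades can reach the target window at different polymer levels, and the cost term breaks the tie. A real platform would typically replace the exhaustive enumeration with a learned model and a smarter search.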

The composition was studied across two actives with similar properties and in two release-rate regions of interest. The space encompassed 8 excipients and multiple dosage form presentations. The training set comprised 19 laboratory-scale experiments (a few hundred grams to 1 kg). The experimental data was already at hand from different sources, including small designed studies and initial formulation efforts. The Sunthetics ML/AI platform for composition and process data science was used to create an ML/AI model that generates insights and predictions from the training set and establishes the design space.

Use of the ML/AI model enabled the team to find the right operating region for the required tablet performance, and to optimize that performance toward a goal provided by the clinical team, more quickly. Specifically, the platform showed that several different grades of excipients could be used to achieve comparable results, opening the possibility of accommodating supply chain and formulation preferences while still achieving goals. A key value of the platform is its built-in visualization tools. The visualization toolbox allows the team to look at the data from several perspectives, building an understanding of multi-variable interactions and gaining new insights into the drivers of critical quality attributes. A slide bar function gives immediate feedback on the likely outcome of a proposed formulation and can aid team discussion when deliberating options. The model also gives several types of suggested optimal points, as well as suggested experiments to improve the reliability of the dataset.
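One simple way to think about suggested experiments that improve the reliability of a dataset is space filling: propose the candidate formulation that lies farthest from anything already tested. The sketch below illustrates that idea only. It is an assumption for illustration, not the method the platform uses, and the two-factor points (scaled to the range 0 to 1) are hypothetical.

```python
import math


def nearest_distance(point, tested):
    """Distance from a candidate to its closest already-tested formulation."""
    return min(math.dist(point, t) for t in tested)


def suggest_next(candidates, tested):
    """Pick the candidate in the least-explored region (maximin distance)."""
    return max(candidates, key=lambda c: nearest_distance(c, tested))


# Hypothetical two-factor compositions, each factor scaled to [0, 1].
tested = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9)]
candidates = [(0.15, 0.15), (0.5, 0.5), (0.9, 0.1), (0.8, 0.8)]

suggestion = suggest_next(candidates, tested)
print(suggestion)
```

Model-based tools often use predictive uncertainty rather than plain distance for this purpose, but the goal is the same: spend the next experiment where the data says the least.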

Use of the model eliminated several iterations of designed experiments and led to the identification of a formulation meeting the target profile on an expedited timeline.