(140d) Applying Data Science Techniques to Solubility Data for Synthetic Compounds: An Expedited End-to-End Workflow from Data Collection to Crystallization Process Design

Conference

AIChE Annual Meeting

Year

2018

Proceeding

2018 AIChE Annual Meeting

Group

Pharmaceutical Discovery, Development and Manufacturing Forum

Session

Data Analytics for Process Prediction

Time

Tuesday, October 30, 2018 - 1:45pm to 2:10pm

Authors

Lovette, M. - Presenter, Amgen Inc.

Huggins, S., Amgen Inc.

Crystallization is an important process, often employed several times within synthetic routes for drug substances. The development of robust crystallization processes can be separated into two tasks: (1) the selection of a solvent system, and (2) the design of a process within that solvent system that affords material with the desired chemical/physical purity and meets yield requirements. An end-to-end workflow relying on data science techniques was developed for capturing, visualizing, and interpreting solubility data to facilitate the consistent and rapid execution of these tasks.

This workflow starts with the collection of solubility data, using standardized equipment sets and approaches, into templated tables. These tables contextualize the solubility data by joining each measured value (concentration, temperature, composition, etc.) with relevant metadata (solute purity, x-ray diffraction results, equipment, date, etc.). The contextualized solubility data is ingested into a database, providing a single source for all solubility data. A templated visualization then consumes the data from this source. Further, the data can be filtered within the visualization as necessary (e.g., limited to a specific solute and to only the data collected for a specific lot of that solute).
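The joining and filtering described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the field names (`sample_id`, `lot`, `xrpd_form`, etc.), the example values, and the helper functions `contextualize` and `filter_rows` are hypothetical and do not reflect the authors' actual templates or database.

```python
# Hypothetical measured values; numbers are invented for illustration only.
measurements = [
    {"sample_id": "S1", "solvent": "ethanol", "temp_c": 20.0, "conc_mg_ml": 12.5},
    {"sample_id": "S2", "solvent": "ethanol", "temp_c": 50.0, "conc_mg_ml": 48.0},
    {"sample_id": "S3", "solvent": "heptane", "temp_c": 20.0, "conc_mg_ml": 0.4},
]

# Hypothetical metadata keyed by the same sample identifier.
metadata = {
    "S1": {"solute": "compound-A", "lot": "L001", "xrpd_form": "Form I"},
    "S2": {"solute": "compound-A", "lot": "L001", "xrpd_form": "Form I"},
    "S3": {"solute": "compound-A", "lot": "L002", "xrpd_form": "Form I"},
}


def contextualize(measurements, metadata):
    """Join each measured value with its metadata on sample_id."""
    return [
        {**m, **metadata[m["sample_id"]]}
        for m in measurements
        if m["sample_id"] in metadata
    ]


def filter_rows(rows, **criteria):
    """Keep only rows whose fields equal every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


table = contextualize(measurements, metadata)
# Limit to one solute and one lot of that solute, as in the example above.
lot_l001 = filter_rows(table, solute="compound-A", lot="L001")
```

Keeping the metadata in the same row as each measurement is what makes later filtering (by lot, solid form, or equipment) a one-line operation.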

To facilitate solvent selection as the first task in crystallization process development, this visualization automatically applies a decision tree to the collected solubility data to classify each solvent, with regard to crystallization, as likely to be: "good for a thermal process", a "solvent within antisolvent driven crystallization", or a "good antisolvent". Based on this classification, the scientist working with the system can either begin the task of process design or apply a predictive solubility model, integrated within the visualization, to determine other solvent systems that may be worth investigating. The application of this model uses simple R scripts, with open source libraries (e.g., non-linear optimization packages) doing the "heavy lifting".
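A decision tree of this kind could look roughly like the sketch below. The abstract does not give the tree's inputs or split points, so the two inputs (solubility at a low and a high temperature, in mg/mL), the thresholds `low`, `high`, and `min_ratio`, and the extra "unclassified" outcome are all assumptions chosen only to show the structure.

```python
def classify_solvent(sol_cold, sol_hot, low=5.0, high=50.0, min_ratio=3.0):
    """Classify a solvent from solubility (mg/mL) at a low and a high temperature.

    All thresholds are hypothetical placeholders, not values from the abstract.
    """
    # Poor solubility even when hot: candidate antisolvent.
    if sol_hot < low:
        return "good antisolvent"
    # High solubility when hot and a steep temperature dependence:
    # cooling alone can recover most of the solute.
    if sol_hot >= high and sol_hot / max(sol_cold, 1e-9) >= min_ratio:
        return "good for a thermal process"
    # High solubility even when cold: needs an antisolvent to drive yield.
    if sol_cold >= high:
        return "solvent within antisolvent driven crystallization"
    return "unclassified"


# Example with invented solubility values.
labels = {
    "heptane": classify_solvent(0.2, 0.4),
    "ethanol": classify_solvent(10.0, 80.0),
    "DMSO": classify_solvent(90.0, 120.0),
}
```

Writing the tree as explicit, ordered rules keeps each classification traceable to the measurement that triggered it.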

Once a solvent system has been selected, the task of process design begins, using an automated script to fit and select the best model for the solubility data across ranges of temperature and antisolvent ratio within that system. The contextualization of the solubility data allows for visual identification and rapid exclusion of outliers within the model fitting step. Once a solubility model has been selected and fit for a given system, a constrained optimization algorithm is applied to determine the process that affords the highest yield given user-supplied constraints (which are modified to ensure the process meets desired chemical/physical purity needs). These initial crystallization process conditions are then attempted and refined as necessary.
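The fit-then-optimize step can be sketched for the simplest case, a cooling-only process. The abstract does not name the solubility models or the optimization algorithm used, so this sketch assumes a van't Hoff form, ln(C) = a + b/T, fitted by ordinary least squares, and substitutes a brute-force grid search for the constrained optimizer. The function names, the constraint set (temperature window and a maximum charge concentration), and the synthetic data are all hypothetical.

```python
import math


def solubility(a, b, temp_c):
    """Assumed van't Hoff model: C = exp(a + b / T), with T in kelvin."""
    return math.exp(a + b / (temp_c + 273.15))


def fit_vant_hoff(temps_c, concs):
    """Least-squares fit of ln(C) against 1/T; returns (a, b)."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(c) for c in concs]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def best_cooling_process(a, b, t_min_c, t_max_c, max_conc, step_c=1.0):
    """Grid search for the highest theoretical yield under the constraints.

    Constraints (assumed): t_min_c <= final T < dissolution T <= t_max_c,
    and the charge concentration (saturation at the dissolution T) <= max_conc.
    """
    n_steps = int(round((t_max_c - t_min_c) / step_c))
    best = None
    for i in range(1, n_steps + 1):
        t_hot = t_min_c + i * step_c
        charge = solubility(a, b, t_hot)
        if charge > max_conc:
            continue  # violates the concentration constraint
        for j in range(i):
            t_final = t_min_c + j * step_c
            yield_frac = 1.0 - solubility(a, b, t_final) / charge
            if best is None or yield_frac > best["yield"]:
                best = {"t_hot_c": t_hot, "t_final_c": t_final, "yield": yield_frac}
    return best


# Synthetic data generated from assumed parameters; not measured values.
temps = [10.0, 20.0, 30.0, 40.0, 50.0]
concs = [solubility(20.0, -5000.0, t) for t in temps]

a, b = fit_vant_hoff(temps, concs)
plan = best_cooling_process(a, b, t_min_c=5.0, t_max_c=60.0, max_conc=100.0)
```

A grid search is used here only because it is easy to verify by inspection; a gradient-based constrained optimizer would be the natural replacement once antisolvent ratio is added as a second decision variable.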

This end-to-end workflow has resulted in significant time savings and has allowed consistent expectations to be set for initial designs across projects. Further, it is an early demonstration of the integration of modelling and data science techniques within process development.

Topics

Crystallization
