(208c) Self-Supervised Learning Methods for Drug Substance and Drug Product Characterization in the Pharmaceutical Industry

Conference

AIChE Annual Meeting

Year

2022

Proceeding

2022 Annual Meeting

Group

Pharmaceutical Discovery, Development and Manufacturing Forum

Session

Enabling Technologies: Mechanistic and statistical modeling

Time

Monday, November 14, 2022 - 4:12pm to 4:33pm

Authors

Salami, H. - Presenter, Georgia Institute of Technology

Skomski, D., Merck & Co., Inc.

Machine learning methods have been applied to a variety of problems in chemical engineering research and development. Among these are data-driven, neural network-based methods, which are the standard for most computer vision tasks. Typically taking the form of convolutional networks, they are useful for tasks such as classification and segmentation. In the pharmaceutical industry, such tasks include analyzing raw image data generated by testing samples of drug product and drug substance for different modalities. Examples include microfluidic or powder-dispersed optical imaging data (for characterizing particles in sterile liquid formulations and oral formulations) as well as data from in situ cameras in crystallization vessels. Such characterizations have important implications for the regulatory aspects of product development.

Because they are data-driven, these models usually rely on large amounts of data to achieve goals such as classifying subvisible particles in a solution or detecting extraneous matter or impurity crystals in a vessel. However, training them for such tasks requires labeled data prepared by a human user, which is tedious and very time-consuming. In this talk, we will discuss how a family of self-supervised or weakly supervised learning methods can speed up training and thereby accelerate practical applications. These methods include autoencoder-based and contrastive learning-based approaches. In essence, they are built on the idea of having the networks perform a pretext task in which they learn the most important features of the available data without relying on labels provided by an operator. We will discuss applying such approaches to characterizing different systems, from protein aggregates in sterile liquid formulations to impurity particles in small-molecule crystallization processes.
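
To make the idea of a label-free pretext task concrete, the following is a minimal sketch of one common contrastive objective, the NT-Xent loss popularized by SimCLR. The abstract does not say which contrastive loss or architecture the authors used, so the choice of NT-Xent, the function name `nt_xent_loss`, the `temperature` value, and the random stand-in embeddings are all assumptions made for illustration. In practice the embeddings would come from a convolutional encoder applied to two augmented views of each particle or crystal image.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent contrastive loss (illustrative sketch, not the authors' method).

    z_a[i] and z_b[i] are embeddings of two augmented views of the same image;
    every other embedding in the combined batch is treated as a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0).astype(np.float64)
    # Normalize rows so that dot products equal cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    # An embedding must not be compared with itself.
    np.fill_diagonal(sim, -np.inf)
    # Row i's positive partner is the other view of the same image.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-sum-exp over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob_pos = sim[np.arange(2 * n), pos] - log_norm
    return float(-log_prob_pos.mean())


# Stand-in embeddings (hypothetical): 8 images, 16-dimensional features.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
view_b = view_a + 0.01 * rng.normal(size=(8, 16))  # nearly identical second view

loss_matched = nt_xent_loss(view_a, view_b)
# Pairing each image with the wrong partner should give a higher loss.
loss_mismatched = nt_xent_loss(view_a, np.roll(view_b, 1, axis=0))
print(loss_matched, loss_mismatched)
```

No operator-provided labels appear anywhere in this objective: the only supervision is the knowledge that two views originate from the same raw image, which is what allows pretraining on unlabeled imaging data before fine-tuning on a small labeled set.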

Topics

Pharmaceutical Development
