(383h) Machine Learning and Natural Language Processing for Pharmaceutical Product Engineering
AIChE Annual Meeting
2016
Process Development Division
Tools and Techniques for Product Design
Tuesday, November 15, 2016 - 2:50pm to 3:10pm
Pharmaceutical product engineering is a "Big Data" discipline. It requires a detailed understanding of the drug chemistry during production and within the body, the manufacturing processes and conditions, and the pharmacokinetics of a disease, all of which are data intensive. In fact, a typical New Drug Application (NDA) contains more than 100,000 pages of varied information. In this talk, we present a framework, called HOLMES, for the automatic extraction of knowledge from primary sources related to pharmaceutical product engineering. The extracted information is then stored in ontologies, which are computer-readable semantic knowledge representations used in artificial intelligence. We describe the Machine Learning (ML) algorithms and Natural Language Processing (NLP) techniques used in HOLMES for Entity and Concept Recognition and Relation Extraction. We will discuss our progress on the creation of an entity-concept-and-relation databank (7968 entities and concepts, 1665 relations); the application of different ML algorithms for joint Entity and Concept detection; and the development of a relation clustering algorithm using common feature sets.
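To make the pipeline of entity recognition, relation extraction, and ontology-style storage more concrete, the following is a minimal sketch in Python. It is not the HOLMES implementation: it swaps the ML models described above for a simple dictionary lookup and trigger-phrase matching, and the lexicon, trigger phrases, relation names, and function names are all hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: surface term -> concept class (not from HOLMES).
LEXICON = {
    "ibuprofen": "Drug",
    "wet granulation": "Process",
    "lactose": "Excipient",
}

# Hypothetical trigger phrases mapped to illustrative relation names.
RELATION_TRIGGERS = {
    "is processed by": "processedBy",
    "is blended with": "blendedWith",
}


def find_entities(sentence):
    """Return (start, end, term, concept) tuples, ordered by position."""
    lowered = sentence.lower()
    found = []
    for term, concept in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), match.end(), term, concept))
    return sorted(found)


def extract_triples(sentence):
    """Emit (subject, relation, object) when a trigger phrase lies
    between two adjacent recognized entities."""
    lowered = sentence.lower()
    entities = find_entities(sentence)
    triples = []
    for left, right in zip(entities, entities[1:]):
        between = lowered[left[1]:right[0]].strip()
        for trigger, relation in RELATION_TRIGGERS.items():
            if trigger in between:
                triples.append((left[2], relation, right[2]))
    return triples


def build_store(sentences):
    """Collect triples into a subject -> {(relation, object)} mapping,
    a very rough stand-in for an ontology."""
    store = defaultdict(set)
    for sentence in sentences:
        for subject, relation, obj in extract_triples(sentence):
            store[subject].add((relation, obj))
    return store


if __name__ == "__main__":
    corpus = [
        "Ibuprofen is blended with lactose.",
        "Ibuprofen is processed by wet granulation.",
    ]
    for subject, facts in build_store(corpus).items():
        print(subject, sorted(facts))
```

A learned tagger and relation classifier would replace `find_entities` and `extract_triples` in a realistic system; the subject-relation-object triples they produce are the kind of statement an ontology stores.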