(88d) A Knowledge-Graph-Based Pharmaceutical Engineering Chatbot for Drug Discovery
AIChE Annual Meeting
2024
Pharmaceutical Discovery, Development and Manufacturing Forum
Advances in Drug Discovery Processes (including HTE): Advanced Technology Approaches to Maximize Public Health Impacts
Monday, October 28, 2024 - 9:03am to 9:24am
Despite their great success in day-to-day applications, ChatGPT and other large language models (LLMs) have made less headway in scientific and engineering domains. One key challenge is the abundance of domain-specific terminology, which a general-purpose LLM is not trained to extract in a way that is consistent with the underlying physical laws. This can lead to unreliable results or hallucinations. To address these challenges, we have developed SUSIE, an ontology-based pharmaceutical information extraction tool that extracts semantic triples and presents them to the user as knowledge graphs (KGs) (1, 2). KGs help visualize the relationships between entities, but users cannot easily pose questions to them directly. They can, however, serve as structured inputs for LLMs. KGs can therefore be used to query a corpus of pharmaceutical documents efficiently, streamlining drug discovery and manufacturing processes. In this work, we present a customized question-answering module that lets the user query the generated KGs and receive an answer in natural language. We show that when prompt engineering is integrated into a model based on the set of rules defined by the ontology, the output is sensible and explainable.
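As a rough illustration of the idea, and not the SUSIE implementation, the sketch below stores a few semantic triples, retrieves those that mention an entity, and formats them as a grounded prompt that could be passed to an LLM. The triples, the predicate names, and the helpers `find_triples` and `build_prompt` are all hypothetical and chosen only for illustration; no LLM is called.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One semantic triple (subject, predicate, object) from a knowledge graph."""
    subject: str
    predicate: str
    obj: str


# Hypothetical triples; a real KG would be produced by the extraction tool.
TRIPLES = [
    Triple("API-X", "has_solvent", "ethanol"),
    Triple("API-X", "has_unit_operation", "crystallization"),
    Triple("crystallization", "has_temperature", "5 C"),
]


def find_triples(triples: List[Triple], entity: str) -> List[Triple]:
    """Return every triple whose subject or object equals the entity (case-insensitive)."""
    key = entity.lower()
    return [t for t in triples if key in (t.subject.lower(), t.obj.lower())]


def build_prompt(question: str, facts: List[Triple]) -> str:
    """Format retrieved triples as a list of facts followed by the user question."""
    lines = [
        f"- {t.subject} {t.predicate.replace('_', ' ')} {t.obj}" for t in facts
    ]
    return (
        "Answer the question using only the facts below. "
        "If the facts are insufficient, say so.\n"
        "Facts:\n" + "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    facts = find_triples(TRIPLES, "crystallization")
    print(build_prompt("At what temperature is crystallization run?", facts))
```

Restricting the prompt to retrieved triples is one simple way to keep an answer traceable to the graph: each statement in the response can be checked against the listed facts.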
Bibliography:
(1) Mann V., Viswanath S., Vaidyaraman S., Balakrishnan J., Venkatasubramanian V., (2023).
SUSIE: Pharmaceutical CMC ontology-based information extraction for drug development using machine learning,
Computers & Chemical Engineering, Volume 179, 108446.
https://doi.org/10.1016/j.compchemeng.2023.108446
(2) Remolona M. F. M., Conway M. F., Balasubramanian S., Fan L., Feng Z., Gu T., Kim H., Nirantar P. M., Panda S., Ranabothu N. R., Rastogi N., Venkatasubramanian V., (2017).
Hybrid ontology-learning materials engineering system for pharmaceutical products: Multi-label entity recognition and concept detection,
Computers & Chemical Engineering, Volume 107, Pages 49-60.