(88d) A Knowledge-Graph-Based Pharmaceutical Engineering Chatbot for Drug Discovery | AIChE

Authors 

Chakraborty, A., Columbia University in the City of New York
Venkatasubramanian, V., Columbia University
Tsai, C. E., Columbia University
Despite their great success in day-to-day applications, ChatGPT and other large language models (LLMs) have made less headway in scientific and engineering domains. One key challenge is the abundance of domain-specific terminology, which an LLM is not trained to extract in a way that is consistent with the underlying physical laws. This can lead to unreliable results or hallucinations. To address these challenges, we have developed SUSIE, an ontology-based pharmaceutical information extraction tool that extracts semantic triples and presents them to the user as knowledge graphs (KGs) (1, 2). While KGs help visualize the relationships between entities, they are not easy for users to question directly; they can, however, serve as structured inputs for LLMs. KGs can therefore be used to query a corpus of pharmaceutical documents efficiently, streamlining drug discovery and manufacturing processes. In this work, we present a customized question-answering module that enables the user to query the generated KGs and receive an answer in natural language. We show that integrating prompt engineering into a model based on the rules defined by the ontology yields output that is sensible and explainable.
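To make the idea of querying a KG through an LLM more concrete, the following is a minimal sketch of one way such a pipeline could be arranged: semantic triples are stored as (subject, predicate, object) records, the triples that mention an entity from the question are retrieved, and a prompt is assembled that restricts the model to those facts. It is not the authors' implementation. The `Triple` class, the `retrieve` and `build_prompt` helpers, the prompt wording, and the placeholder entities (`Compound-A`, `Solvent-B`, and so on) are all assumptions made for illustration, and the call to an actual LLM is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One semantic triple (an edge in the knowledge graph)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical placeholder triples for illustration only; they were not
# extracted by SUSIE and do not come from any real document.
TRIPLES = [
    Triple("Compound-A", "is_dissolved_in", "Solvent-B"),
    Triple("Compound-A", "has_unit_operation", "crystallization"),
    Triple("crystallization", "has_parameter", "cooling rate"),
    Triple("Compound-C", "is_dissolved_in", "Solvent-D"),
]


def retrieve(question, triples):
    """Keep triples whose subject or object is mentioned in the question.

    Simple substring matching is an assumption; a real system would
    likely use entity linking guided by the ontology.
    """
    q = question.lower()
    return [t for t in triples if t.subject.lower() in q or t.obj.lower() in q]


def build_prompt(question, facts):
    """Assemble a prompt that restricts the model to the retrieved facts."""
    lines = [
        f"- {t.subject} {t.predicate.replace('_', ' ')} {t.obj}" for t in facts
    ]
    return (
        "Answer using only the facts below. "
        "If they are insufficient, say so.\n"
        "Facts:\n" + "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"
    )


question = "Which solvent is Compound-A dissolved in?"
facts = retrieve(question, TRIPLES)
prompt = build_prompt(question, facts)
print(prompt)  # this string would then be sent to an LLM of choice
```

Because the answer is grounded in an explicit list of retrieved triples, each statement in the response can in principle be traced back to an edge in the graph, which is one plausible reading of the explainability the abstract refers to.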

Bibliography:

(1) Mann V., Viswanath S., Vaidyaraman S., Balakrishnan J., Venkatasubramanian V. (2023). SUSIE: Pharmaceutical CMC ontology-based information extraction for drug development using machine learning. Computers & Chemical Engineering, Volume 179, 108446. https://doi.org/10.1016/j.compchemeng.2023.108446

(2) Remolona M. F. M., Conway M. F., Balasubramanian S., Fan L., Feng Z., Gu T., Kim H., Nirantar P. M., Panda S., Ranabothu N. R., Rastogi N., Venkatasubramanian V. (2017). Hybrid ontology-learning materials engineering system for pharmaceutical products: Multi-label entity recognition and concept detection. Computers & Chemical Engineering, Volume 107, Pages 49-60. https://doi.org/10.1016/j.compchemeng.2017.03.012