(252b) Automated Single-Cell RNA Sequencing Analysis Supported By Large Language Models
2024 AIChE Annual Meeting
Computing and Systems Technology Division
10D: Applied Math for Biological and Biomedical Systems
Tuesday, October 29, 2024 - 8:18am to 8:36am
Large Language Models (LLMs), such as OpenAI's GPT-4 [3], excel at generating and understanding human-like text. Trained on large datasets and relevant literature, they have shown potential in scientific research. Their command of natural language enables applications in the biomedical sciences, such as automating data interpretation and analysis, that can assist both experts and non-experts in the field.
In this work, we present scChat, an innovative scRNA-seq analysis platform that leverages the power of LLMs to streamline the clustering and annotation of cells. scChat is designed to make advanced scRNA-seq analysis accessible to everyone, removing barriers for those without a background in biology or programming. In addition, by combining the research context with the statistics of the scRNA-seq data, scChat provides detailed biological explanations and experimental suggestions using the deep biological knowledge of the LLMs. Through an easy-to-use chat interface, users can upload their scRNA-seq datasets and engage in an intuitive dialog to automatically annotate cells and receive detailed annotation plots. In case of unsatisfactory results, scChat offers the flexibility for further interactive analysis, acting as a virtual research assistant. This platform not only accelerates data analysis, but also provides in-depth explanations and insights into the annotations, facilitating the discovery of changes in cell populations across samples. Such insights are critical for understanding disease progression and identifying effective therapeutic targets based on cellular and genetic markers. scChat is a transformative tool that enables researchers to gain a deeper understanding of disease mechanisms and improve the development of targeted treatment strategies.
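The idea of combining the research context with statistics of the scRNA-seq data can be illustrated with a minimal sketch. This is not scChat's implementation: the function names (`top_markers`, `build_annotation_prompt`), the prompt wording, and the example gene scores are all hypothetical, and the step that sends the prompt to an LLM is deliberately left out.

```python
from typing import Dict, List, Tuple


def top_markers(
    ranked: Dict[str, List[Tuple[str, float]]], n: int = 3
) -> Dict[str, List[str]]:
    """Keep the n highest-scoring genes for each cluster (hypothetical helper)."""
    return {
        cluster: [
            gene
            for gene, _ in sorted(genes, key=lambda g: g[1], reverse=True)[:n]
        ]
        for cluster, genes in ranked.items()
    }


def build_annotation_prompt(context: str, markers: Dict[str, List[str]]) -> str:
    """Combine a free-text research context with per-cluster marker genes."""
    lines = [
        f"Research context: {context}",
        "For each cluster, suggest the most likely cell type and explain why.",
    ]
    # Sort clusters so the prompt is deterministic.
    for cluster, genes in sorted(markers.items()):
        lines.append(f"Cluster {cluster}: {', '.join(genes)}")
    return "\n".join(lines)


# Made-up per-cluster gene scores, standing in for differential-expression output.
ranked = {
    "0": [("CD3E", 5.2), ("IL7R", 4.1), ("CD3D", 4.8), ("MALAT1", 1.0)],
    "1": [("MS4A1", 6.0), ("CD79A", 5.5), ("CD74", 3.2)],
}
prompt = build_annotation_prompt(
    "PBMC sample from a healthy donor", top_markers(ranked)
)
print(prompt)
```

In a sketch like this, the resulting text would be passed to an LLM, and the reply would be used to label clusters and to drive follow-up questions in the chat interface.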
The integration of LLMs not only enables accurate interpretation of complex biological data, but also supports experimental design and hypothesis generation, providing significant benefits to disease research and treatment. Through our platform, scChat, we aim to accelerate the discovery process and contribute to the advancement of personalized medicine, ultimately paving the way for improved healthcare outcomes.
[1] Luecken, Malte D., and Fabian J. Theis. "Current best practices in single-cell RNA-seq analysis: a tutorial." Molecular Systems Biology 15.6 (2019): e8746.
[2] Potter, S. Steven. "Single-cell RNA sequencing for the study of development, physiology and disease." Nature Reviews Nephrology 14.8 (2018): 479-492.
[3] Achiam, Josh, et al. "GPT-4 technical report." arXiv preprint arXiv:2303.08774 (2023).
[4] Hou, Wenpin, and Zhicheng Ji. "Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis." Nature Methods (2024): 1-4.