(174bz) scChat: An AI Copilot for Analyzing Single-Cell RNA Sequencing Data | AIChE

Authors 

Bao, X., Purdue University
Varghese, A. J., Purdue University
Nahar, R., Purdue University
Chen, H., Purdue University
Shao, K., Purdue University
Single-cell RNA sequencing (scRNA-seq) has emerged as a revolutionary tool for understanding the intricate cellular dynamics and heterogeneity inherent in biological systems. It provides a high-resolution view of the transcriptomic landscape at the single-cell level, offering insight into molecular heterogeneity and revealing changes in cell populations during disease progression and in response to treatment [1-2].

Large Language Models (LLMs), such as OpenAI's GPT-4 [3], excel at generating and understanding human-like text. Trained on large datasets and relevant literature, they have shown potential in scientific research. Their deep understanding of natural language enables innovative applications in the biomedical sciences, such as automating data interpretation and analysis, which can assist both experts and non-experts in the field.

In this work, we present scChat, an innovative scRNA-seq analysis platform that leverages the power of LLMs to streamline the clustering and annotation of cells. scChat is designed to make advanced scRNA-seq analysis accessible to everyone, removing barriers for those without a background in biology or programming. In addition, by combining the research context with the statistics of the scRNA-seq data, scChat provides detailed biological explanations and experimental suggestions using the deep biological knowledge of the LLMs. Through an easy-to-use chat interface, users can upload their scRNA-seq datasets and engage in an intuitive dialog to automatically annotate cells and receive detailed annotation plots. In case of unsatisfactory results, scChat offers the flexibility for further interactive analysis, acting as a virtual research assistant. This platform not only accelerates data analysis, but also provides in-depth explanations and insights into the annotations, facilitating the discovery of changes in cell populations across samples. Such insights are critical for understanding disease progression and identifying effective therapeutic targets based on cellular and genetic markers. scChat is a transformative AI copilot that enables researchers to gain a deeper understanding of disease mechanisms and improve the development of targeted treatment strategies.
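As a rough illustration of what "combining the research context with the statistics of the scRNA-seq data" might look like in practice, the sketch below assembles a text prompt from a free-text research description and per-cluster marker genes. It is not scChat's actual implementation: the function name `build_annotation_prompt`, the prompt wording, and the example marker lists are all hypothetical, and the step that would send the prompt to an LLM is deliberately left out.

```python
from typing import Dict, List


def build_annotation_prompt(
    research_context: str,
    cluster_markers: Dict[str, List[str]],
    top_n: int = 10,
) -> str:
    """Combine a research description with per-cluster marker genes.

    Hypothetical helper for illustration only; the prompt wording is an
    assumption and is not taken from scChat.
    """
    lines = [
        "You are assisting with single-cell RNA-seq cell type annotation.",
        f"Research context: {research_context.strip()}",
        "For each cluster, suggest the most likely cell type and briefly justify it.",
        "",
    ]
    # Emit clusters in sorted order so the prompt is reproducible.
    for cluster_id in sorted(cluster_markers):
        genes = cluster_markers[cluster_id][:top_n]  # keep only the top-ranked markers
        lines.append(f"Cluster {cluster_id}: {', '.join(genes)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative marker lists (assumed to be ranked by differential expression).
    markers = {
        "0": ["CD3D", "CD3E", "IL7R"],
        "1": ["MS4A1", "CD79A", "CD79B"],
    }
    print(build_annotation_prompt("PBMC sample from a healthy donor", markers))
```

In a chat-style workflow, the returned string would be passed to the LLM together with the conversation history, and the model's reply could then be mapped back onto the clusters for plotting.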

The integration of LLMs not only enables accurate interpretation of complex biological data, but also supports experimental design and hypothesis generation, providing significant benefits to disease research and treatment. Through our platform, scChat, we aim to accelerate the discovery process and contribute to the advancement of personalized medicine, ultimately paving the way for improved healthcare outcomes.

[1] Luecken, Malte D., and Fabian J. Theis. "Current best practices in single-cell RNA-seq analysis: a tutorial." Molecular Systems Biology 15.6 (2019): e8746.

[2] Potter, S. Steven. "Single-cell RNA sequencing for the study of development, physiology and disease." Nature Reviews Nephrology 14.8 (2018): 479-492.

[3] Achiam, Josh, et al. "GPT-4 technical report." arXiv preprint arXiv:2303.08774 (2023).

[4] Hou, Wenpin, and Zhicheng Ji. "Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis." Nature Methods (2024): 1-4.