An Agentic AI Framework for Ingestion and Standardization of Single-Cell RNA-seq Data Analysis
Abstract
The proliferation of publicly available single-cell RNA sequencing (scRNA-seq) data has created significant opportunities in biomedical research. However, reuse of these resources is constrained by a series of preparatory steps: extracting metadata from the primary literature, retrieving datasets from the corresponding repositories, and then running standardized downstream analysis by hand. These tasks often require manual scripting and rely on fragmented workflows, which limits accessibility and increases turnaround time. To address these challenges, we designed a two-component system in which an artificial intelligence (AI) agent coordinates an automated analysis pipeline. CellAtria (Agentic Triage of Regulated single-cell data Ingestion and Analysis) is an agentic AI framework that enables dialogue-driven, document-to-analysis automation through a chatbot interface. Built on a graph-based, multi-actor architecture, CellAtria integrates a large language model (LLM) with tool-execution capabilities to orchestrate the full lifecycle of data reuse. To support downstream analysis, CellAtria incorporates CellExpress, a co-developed pipeline that applies state-of-the-art scRNA-seq processing steps to transform raw count matrices into analysis-ready single-cell profiles. CellAtria thus provides time-efficient access to standardized single-cell data ingestion and analysis, regardless of the user's computational skill.
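As a rough illustration of what a graph-based orchestration of the document-to-analysis lifecycle might look like, the following is a minimal sketch in plain Python. It is not CellAtria's implementation: the node functions (`extract_metadata`, `fetch_dataset`, `run_pipeline`), the `GRAPH` table, the placeholder accession `GSE000000`, and the file paths are all hypothetical, and the LLM and the real tools are replaced by stubs.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, object]
Node = Callable[[State], State]

# Hypothetical stand-ins for the agent's tools; not CellAtria's real API.
def extract_metadata(state: State) -> State:
    # A real agent would have an LLM read the article; here the result is faked.
    return {**state, "accession": "GSE000000", "organism": "human"}

def fetch_dataset(state: State) -> State:
    # Placeholder for a repository download keyed by the extracted accession.
    return {**state, "data_path": f"/tmp/{state['accession']}/counts.mtx"}

def run_pipeline(state: State) -> State:
    # Placeholder for handing the raw count matrix to the analysis pipeline.
    return {**state, "status": f"analysis queued for {state['data_path']}"}

# Graph: node name -> (function, name of the next node, or None to stop).
GRAPH: Dict[str, Tuple[Node, Optional[str]]] = {
    "extract": (extract_metadata, "fetch"),
    "fetch": (fetch_dataset, "analyze"),
    "analyze": (run_pipeline, None),
}

def run_graph(start: str, state: State) -> State:
    """Walk the graph from `start`, threading a shared state through each node."""
    node: Optional[str] = start
    while node is not None:
        step, node = GRAPH[node]
        state = step(state)
    return state

result = run_graph("extract", {"document": "article.pdf"})
print(result["status"])
```

Each node reads and extends a shared state dictionary, so the output of one stage (for example, the accession extracted from the article) becomes the input of the next. In a dialogue-driven system the fixed "next node" entries would presumably be replaced by routing decisions made by the LLM.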