scMarkerAgent: An LLM Evidence Agent-based Cell Marker Atlas


Abstract

Reliable, evidence-backed cell-type annotation remains a major bottleneck in single-cell RNA-seq analysis, particularly for rare, transitional, and disease-associated populations. To address this, we introduce scMarkerAgent, an evidence-grounded cell marker resource developed using an LLM-assisted literature-curation framework. It draws on 294,692 full-text publications to provide 890,296 high-quality cell type–marker annotations for 50,233 cell types across human, mouse, and rat. scMarkerAgent also includes 82,165 curated negative-marker annotations and 417,812 disease-context annotations, which improve the disambiguation of homologous cell types and the delineation of malignant cells. Every cell type–marker annotation is directly supported by sentence-level literature evidence. In the cell annotation workflow, candidate labels are further refined through an LLM-based reasoning step that jointly evaluates positive and negative markers. Compared with existing resources, scMarkerAgent offers broader coverage of markers, tissues, cell types, and diseases. It is released as a FAIR-compliant database together with a code-free web platform that supports marker retrieval, automated cell annotation, and customizable cell scoring (available at https://www.markeragent.net).
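
The idea of jointly evaluating positive and negative markers can be illustrated with a minimal Python sketch. This is not scMarkerAgent's implementation, which the abstract describes as an LLM-based reasoning step. The function names, the scoring rule (mean positive-marker expression minus mean negative-marker expression), the marker sets, and the expression values are all assumptions chosen for illustration, not entries from the database.

```python
from typing import Dict, Iterable, List, Tuple


def marker_score(
    expression: Dict[str, float],
    positive: Iterable[str],
    negative: Iterable[str],
) -> float:
    """Hypothetical score: mean positive-marker expression minus
    mean negative-marker expression. Missing genes count as 0."""
    pos = [expression.get(gene, 0.0) for gene in positive]
    neg = [expression.get(gene, 0.0) for gene in negative]
    pos_mean = sum(pos) / len(pos) if pos else 0.0
    neg_mean = sum(neg) / len(neg) if neg else 0.0
    return pos_mean - neg_mean


def rank_labels(
    expression: Dict[str, float],
    candidates: Dict[str, Dict[str, List[str]]],
) -> List[Tuple[str, float]]:
    """Score every candidate label and return them best-first."""
    scores = {
        label: marker_score(expression, m["positive"], m["negative"])
        for label, m in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative average expression for one cluster (made-up values).
cluster_expression = {"CD3E": 2.0, "CD4": 1.5, "CD8A": 0.1, "MS4A1": 0.0}

# Illustrative marker sets (assumed for this example only).
candidate_markers = {
    "CD4 T cell": {"positive": ["CD3E", "CD4"], "negative": ["CD8A"]},
    "CD8 T cell": {"positive": ["CD3E", "CD8A"], "negative": ["CD4"]},
    "B cell": {"positive": ["MS4A1"], "negative": ["CD3E"]},
}

ranking = rank_labels(cluster_expression, candidate_markers)
for label, score in ranking:
    print(f"{label}: {score:.2f}")
```

In this toy example the negative markers are what separate the two T-cell labels: both share CD3E as a positive marker, but high CD4 expression penalizes the CD8 T cell candidate.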
