LLM-based cell type annotation harmonization across single-cell studies using GCTHarmony
Abstract
A major challenge in integrating previously analyzed single-cell RNA-seq studies is the inconsistency of cell type annotations. To address this, we developed GCTHarmony, an LLM-based method for harmonizing cell type annotations across single-cell studies. Using OpenAI's text embedding model, GCTHarmony accurately maps arbitrary cell type annotations to standardized cell ontology terms and reconciles discrepancies in annotation hierarchies across studies. In a real-data example, we show that GCTHarmony substantially improves the consistency of cell type annotations across single-cell studies.
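The mapping step described in the abstract can be pictured as a nearest-neighbor search in embedding space: each study-specific label is embedded, compared against embeddings of the standardized ontology terms, and assigned to the closest term. The following is a minimal sketch of that general idea, not GCTHarmony's implementation. The function names (`toy_embed`, `cosine`, `map_to_ontology`), the example labels, and the short term list are all hypothetical, and `toy_embed` is a character-trigram stand-in used only so the example runs offline; it is not the OpenAI embedding model.

```python
import math
from collections import Counter


def toy_embed(text, n=3):
    """Stand-in embedding (assumption): character n-gram counts.

    This is NOT the OpenAI text embedding model; it only makes the
    sketch runnable without network access.
    """
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_to_ontology(labels, ontology_terms, embed=toy_embed):
    """Assign each label to its most similar ontology term.

    Returns {label: (best_term, similarity)}. `embed` is pluggable so a
    real text embedding model could be substituted for the stand-in.
    """
    term_vecs = {term: embed(term) for term in ontology_terms}
    mapping = {}
    for label in labels:
        vec = embed(label)
        scores = {term: cosine(vec, tv) for term, tv in term_vecs.items()}
        best = max(scores, key=scores.get)
        mapping[label] = (best, scores[best])
    return mapping


# Hypothetical inputs: a few standardized terms and two study labels.
ontology_terms = ["B cell", "T cell", "natural killer cell", "monocyte"]
study_labels = ["B cells", "Monocytes"]

mapping = map_to_ontology(study_labels, ontology_terms)
for label, (term, score) in mapping.items():
    print(f"{label!r} -> {term!r} (similarity {score:.2f})")
```

A surface-level stand-in like this only matches labels that look similar as strings, so it would not be expected to resolve abbreviations or synonyms; that gap is the motivation for passing a learned text embedding model as `embed` instead.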