Large Language Models for Sentiment Analysis in Healthcare: A Systematic Review Protocol


Abstract

Large language models (LLMs) have emerged as powerful tools for sentiment analysis in healthcare, offering potential advantages in capturing contextual information and semantic relationships in complex medical text. Healthcare sentiment analysis presents unique challenges due to domain-specific terminology, privacy regulations, and the nuanced nature of patient experiences. This systematic review protocol outlines a comprehensive methodology to investigate the application of LLMs for sentiment analysis across healthcare settings, including analysis of patient feedback, social media content, and electronic health records. By synthesizing current evidence, we aim to provide insights for researchers, clinicians, and policymakers on the effectiveness, limitations, and ethical considerations of these advanced natural language processing techniques.

We will conduct a systematic review following the PRISMA-P 2015 guidelines and using the PICOS framework. We will search eight major databases (PubMed, Web of Science, Embase, CINAHL, MEDLINE, The Cochrane Library, PsycINFO, and Scopus) with a comprehensive search string combining terms related to LLMs, sentiment analysis, and healthcare contexts. We will include peer-reviewed studies published between 2018 (the year BERT was introduced) and March 2025 that apply LLMs to healthcare sentiment analysis and report performance metrics or qualitative evaluations. Two independent reviewers will screen titles/abstracts and full texts, with disagreements resolved through discussion or consultation with a third reviewer. Data extraction will capture study characteristics, research objectives, dataset details, LLM architecture specifications, fine-tuning approaches, performance metrics, and implementation challenges. Quality will be assessed with a modified QUADAS-2 tool and the Cochrane Risk of Bias tool. We will synthesize the findings narratively, organized thematically by research question, and will perform a meta-analysis only if heterogeneity across studies permits.
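The dual-screening rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the review team's actual tooling: the function name `resolve_screening`, the boolean include/exclude encoding, and the use of `None` for a record that still awaits discussion are all hypothetical choices made for the example.

```python
from typing import Optional


def resolve_screening(
    reviewer_a: bool,
    reviewer_b: bool,
    third_reviewer: Optional[bool] = None,
) -> Optional[bool]:
    """Combine two independent screening decisions (True = include).

    If the two reviewers agree, their shared decision is final.
    If they disagree, the decision reached through discussion or by the
    third reviewer is used. None means the conflict is still unresolved.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    return third_reviewer


# Hypothetical record IDs and decisions, for illustration only.
decisions = {
    "record-1": resolve_screening(True, True),
    "record-2": resolve_screening(True, False, third_reviewer=False),
    "record-3": resolve_screening(False, True),  # still needs resolution
}
unresolved = [rid for rid, final in decisions.items() if final is None]
```

Keeping unresolved conflicts as an explicit `None` rather than defaulting to include or exclude makes it easy to list the records that still need a third opinion.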

PROSPERO registration number

CRD420251012298

Strengths and limitations of this study

  • To our knowledge, this is the first systematic review to comprehensively examine large language models for sentiment analysis specifically within healthcare contexts, addressing a significant gap in the literature.

  • The review’s rigorous methodology follows PRISMA-P guidelines and employs dual independent screening, data extraction, and quality assessment to ensure thoroughness and minimize bias.

  • The inclusion of diverse healthcare text sources (patient feedback, social media, electronic health records) allows for a comprehensive understanding of LLM applications across the healthcare information ecosystem.

  • By focusing on studies published since 2018 (when BERT was introduced), the review captures the most relevant technological developments while excluding outdated approaches.

  • A limitation of this study is the expected heterogeneity across included studies (varying LLM architectures, datasets, metrics, and implementation contexts). This may preclude meaningful meta-analysis and limit definitive conclusions about relative performance, so the findings are likely to be more descriptive than prescriptive.
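Whether heterogeneity is low enough for pooling is commonly judged with Cochran's Q and the I² statistic. The protocol does not say which heterogeneity measure will be used, so the sketch below is an assumption: it applies the standard inverse-variance (fixed-effect) formulas, and the function name `cochran_q_and_i2` and its inputs (per-study effect estimates and their variances) are hypothetical.

```python
from typing import Sequence, Tuple


def cochran_q_and_i2(
    effects: Sequence[float],
    variances: Sequence[float],
) -> Tuple[float, float]:
    """Return Cochran's Q and I² (as a percentage) for a set of studies.

    effects:   per-study effect estimates on a common scale
    variances: sampling variances of those estimates (all > 0)
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance weights and the fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I²: share of total variation attributed to between-study
    # heterogeneity rather than chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

A high I² would support falling back to narrative synthesis. Any cut-off for "high" is a judgment call that the protocol does not specify.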
