When silence is safer: a review of LLM abstention in healthcare

Abstract

Large language models (LLMs) are designed to generate answers to user prompts, which often drives them to respond even when uncertainty is high, information is incomplete, or a refusal would be more appropriate. In healthcare, this tendency can be dangerous: confidently stated but inaccurate medical advice can cause significant harm, making the ability to abstain especially important. In this paper, we review studies that investigate LLM abstention behaviors in healthcare. The literature reflects two main motivations: (1) uncertainty-driven abstention, in which the model withholds a response when confidence is low, and (2) safety-driven abstention, in which the model declines to provide information that could lead to harm. Most abstention mechanisms are extrinsic approaches that rely on auxiliary tools to decide when to abstain. We find that state-of-the-art LLMs continue to struggle with refusing inappropriate prompts and that few benchmarks address abstention in real-world medical scenarios, where performance lags behind other domains. To support future research, we introduce a conceptual abstention evaluation framework accompanied by a proof-of-concept implementation. We conclude with a taxonomy of observed refusal patterns and a decision-theoretic formalization of abstention.
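
The abstract distinguishes uncertainty-driven from safety-driven abstention and notes that most mechanisms are extrinsic, relying on auxiliary tools to decide when to abstain. As a minimal illustrative sketch only (not the paper's framework or proof-of-concept), the Python wrapper below shows how an external safety flag and a hypothetical confidence score could gate a model's answer; the names `answer_or_abstain`, `ModelOutput`, and `CONFIDENCE_THRESHOLD` are assumptions introduced for this example.

```python
# Illustrative sketch of an extrinsic abstention wrapper (assumed names and
# threshold; the paper's actual framework may differ).

from dataclasses import dataclass

ABSTAIN_MESSAGE = "I can't answer this reliably; please consult a clinician."
CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; would be tuned per deployment


@dataclass
class ModelOutput:
    text: str
    confidence: float  # e.g., from self-consistency sampling or a calibrated verifier


def answer_or_abstain(output: ModelOutput, flagged_unsafe: bool) -> str:
    """Return the model's answer only when it is both safe and sufficiently confident."""
    if flagged_unsafe:  # safety-driven abstention: an external filter flags potential harm
        return ABSTAIN_MESSAGE
    if output.confidence < CONFIDENCE_THRESHOLD:  # uncertainty-driven abstention
        return ABSTAIN_MESSAGE
    return output.text


# Example: a low-confidence dosing answer is withheld.
print(answer_or_abstain(ModelOutput("Take 500 mg twice daily.", 0.55), flagged_unsafe=False))
```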
