Automated Identification of Contextually Relevant Biomedical Entities with Grounded LLMs


Abstract

This study investigates the effectiveness of different large language models (LLMs) for automated biomedical entity annotation in research articles, with a focus on contextualized and grounded results. A four-step generative workflow iteratively generates and refines entity candidates, using a metadata schema for context and agentic tool use for validation against the PubTator 3 database. The precision of this workflow was assessed with a random-effects meta-analysis after face-to-face interviews with authors of six papers from the Collaborative Research Center (CRC) 1453 “NephGen”. With an overall precision of 91.3%, the selected models provide qualitatively valuable annotations, with GPT-4.1, GPT-4o Mini, and Gemini 2.0 Flash showing the highest precision. While GPT-4.1 and Gemini 2.0 Flash excelled in the total number of correct annotations, GPT-4o Mini and Gemini 2.0 Flash were the fastest and most cost-effective. Large variations in annotation counts and the conflation of publication-specific and dataset-specific annotations highlight that human review (“human-in-the-loop”) remains important. The results further highlight the trade-offs between precision, total number of correct annotations, cost, and speed. While quality is paramount in collaborative research settings, cost-effectiveness could be more critical in public implementations.
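The generate-and-refine loop described in the abstract can be sketched in miniature. Everything below is an illustrative assumption: the step structure, the stubbed LLM proposals, and the in-memory lookup table stand in for real LLM calls and the live PubTator 3 validation service, whose exact interfaces the abstract does not specify.

```python
# Hypothetical sketch of the four-step workflow: generate candidates,
# ground them, reject ungrounded ones, and refine in the next round.

def generate_candidates(text, feedback=None):
    """Stand-in for an LLM call that proposes entity mentions (assumed output)."""
    proposals = {"nephron", "podocyte", "CRC 1453"}  # hypothetical proposals
    if feedback:
        proposals -= feedback  # drop candidates rejected in a prior round
    return proposals

# Toy grounding table; the paper instead queries the PubTator 3 database.
KNOWN_ENTITIES = {"nephron": "MESH:D009399", "podocyte": "MESH:D050199"}

def ground(candidate):
    """Stand-in for PubTator 3 validation: returns an identifier or None."""
    return KNOWN_ENTITIES.get(candidate)

def annotate(text, max_rounds=4):
    """Iteratively generate, validate, and refine entity candidates."""
    accepted, rejected = {}, set()
    for _ in range(max_rounds):
        new = generate_candidates(text, feedback=rejected) - accepted.keys()
        if not new:
            break  # converged: no fresh candidates left to validate
        for cand in new:
            ident = ground(cand)
            if ident:
                accepted[cand] = ident  # grounded annotation is kept
            else:
                rejected.add(cand)      # fed back as refinement signal
    return accepted

print(annotate("example article text"))
```

Here the ungrounded candidate (“CRC 1453”, a project name rather than a biomedical entity) is filtered out, which mirrors why the paper couples generation with grounding: validation against a curated database catches plausible-sounding but non-entity proposals that an LLM alone would emit.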
