Procedural Guideline Execution Training Improves LLM Performance in Rule-Based Clinical Tasks

Abstract

Large Language Models (LLMs) offer transformative potential for Clinical Decision Support (CDS) by processing complex medical information and generating actionable insights. However, ensuring their reliability, strict adherence to Clinical Practice Guidelines (CPGs), and interpretability remains a critical challenge for safe clinical deployment. Existing methods, such as prompting strategies and the integration of external CPG structures, provide guidance but do not intrinsically train the LLM to execute procedural guideline logic. To address this gap, we propose Procedural Guideline Execution Training (PGET), a novel fine-tuning approach that trains LLMs to generate step-by-step execution traces explicitly demonstrating the application of CPG rules to a patient scenario. We evaluate PGET using diverse LLMs on a synthetic dataset for COVID-19 outpatient treatment, comparing it against Zero-Shot Prompting and established CPG integration methods such as Binary Decision Tree integration. Our experiments, leveraging both automatic CPG adherence metrics and expert human evaluation, demonstrate that PGET significantly outperforms the comparison methods, achieving higher CPG adherence, clinical accuracy, and interpretability. The generated execution traces provide valuable transparency, fostering trust in the model's recommendations. PGET offers a promising path towards building more reliable, transparent, and guideline-compliant AI systems for clinical decision support.
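
To make the idea of an execution-trace training target concrete, the sketch below constructs a single hypothetical PGET-style fine-tuning example: a toy patient scenario is run through a few illustrative outpatient-treatment rules, and each rule application is recorded as a step in the target text. The `Patient` fields, the rules and thresholds, and the trace format are assumptions for illustration only; they are not the dataset, guideline, or schema used in the paper.

```python
# Illustrative sketch of a PGET-style training example (not the paper's actual data format).
import json
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    days_since_symptom_onset: int
    requires_supplemental_oxygen: bool
    high_risk_condition: bool


def build_pget_example(patient: Patient) -> dict:
    """Apply a toy outpatient-treatment guideline rule by rule, recording each
    step, and return a (prompt, target) pair for supervised fine-tuning."""
    trace = []

    # Rule 1 (toy): the outpatient pathway applies only if no supplemental oxygen is needed.
    eligible_outpatient = not patient.requires_supplemental_oxygen
    trace.append(
        f"Step 1: Rule 'outpatient eligibility' -> requires_supplemental_oxygen="
        f"{patient.requires_supplemental_oxygen}, so eligible_outpatient={eligible_outpatient}."
    )

    # Rule 2 (toy): antiviral treatment window of 5 days from symptom onset.
    within_window = patient.days_since_symptom_onset <= 5
    trace.append(
        f"Step 2: Rule 'treatment window' -> days_since_symptom_onset="
        f"{patient.days_since_symptom_onset}, so within_window={within_window}."
    )

    # Rule 3 (toy): recommend treatment if high-risk, eligible, and within the window.
    recommend_antiviral = eligible_outpatient and within_window and patient.high_risk_condition
    trace.append(
        f"Step 3: Rule 'high-risk treatment' -> high_risk_condition="
        f"{patient.high_risk_condition}, so recommend_antiviral={recommend_antiviral}."
    )

    recommendation = (
        "Recommend outpatient antiviral treatment."
        if recommend_antiviral
        else "Do not recommend outpatient antiviral treatment; reassess per guideline."
    )

    return {
        "prompt": f"Patient: {patient}. Apply the outpatient treatment guideline step by step.",
        "target": "\n".join(trace) + f"\nConclusion: {recommendation}",
    }


if __name__ == "__main__":
    example = build_pget_example(
        Patient(age=67, days_since_symptom_onset=3,
                requires_supplemental_oxygen=False, high_risk_condition=True)
    )
    print(json.dumps(example, indent=2))
```

Pairs of this shape could then be used for standard supervised fine-tuning, so that the model learns to emit the intermediate rule applications rather than only the final recommendation; the transparency claimed in the abstract comes from that emitted trace.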