Benchmarking Artificial Intelligence vs General Practitioners Decision-Making in Same-Day Appointments Triage: A Mixed-Methods Study in UK Primary Care
Abstract
Background
Artificial intelligence (AI) is increasingly used to support clinical decision-making, particularly in primary care triage. However, few studies have benchmarked AI triage tools against general practitioner (GP) assessments in real-world settings. This study evaluated the agreement between an AI-enabled triage tool (Visiba Triage) and GP urgency ratings for same-day appointment requests. Secondary aims included assessing perceptions of safety, accuracy and usability from both clinician and patient perspectives.
Methods
A mixed-methods study was conducted using data from patients requesting same-day appointments (SDA) between January and June 2024. Urgency scores generated by the Visiba Triage AI tool, based on a modified Manchester Triage System, were compared with GP-assigned ratings using Spearman’s rank correlation and Cohen’s kappa. Ordinal logistic regression assessed associations between demographics and patient satisfaction. Thematic analysis of interviews with eight GPs explored perceptions of the AI tool’s performance.
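As a rough illustration of the two agreement statistics named above, the sketch below computes Spearman's rank correlation and Cohen's kappa in plain Python. It is not the study's analysis code. The function names, the unweighted form of kappa, the numeric coding of urgency (1 = least urgent, 8 = most urgent) and the sample ratings are all assumptions made for illustration.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = sum(1 for u, v in zip(a, b) if u == v) / n  # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Made-up urgency levels (assumed coding: 1 = least urgent, 8 = most urgent).
# These are NOT study data.
ai_levels = [1, 2, 2, 3, 5, 8, 4, 4, 6, 7]
gp_levels = [1, 2, 3, 3, 5, 8, 4, 3, 6, 7]

rho = spearman_rho(ai_levels, gp_levels)
kappa = cohen_kappa(ai_levels, gp_levels)
agreement = sum(a == g for a, g in zip(ai_levels, gp_levels)) / len(ai_levels)
print(f"rho={rho:.3f} kappa={kappa:.3f} agreement={agreement:.1%}")
```

Kappa is lower than raw percentage agreement because it discounts the matches two raters would produce by chance. A weighted kappa, which gives partial credit for near misses on an ordinal scale, would be a reasonable alternative; the abstract does not say which form was used.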
Results
A total of 649 participants were included in this study. The majority were female and of White ethnicity. There was a strong correlation between AI and GP urgency ratings (ρ = 0.796, p < 0.001), with 83.7% categorical agreement across eight urgency levels (κ = 0.69, p < 0.001). The AI system showed a safety-conscious pattern: it was more likely to over-triage than to under-triage, and under-triage was rare. No cases deemed non-urgent by AI were later reclassified as emergencies by GPs. Qualitative findings supported the quantitative results, highlighting perceived accuracy and safety. Current limitations include suboptimal integration with patient medical records. Patient satisfaction varied significantly by age, with older adults (aged 60+) reporting lower satisfaction (aOR 0.25, 95% CI 0.12-0.52).
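The over- and under-triage pattern described above can be tallied with a simple comparison of paired ratings. The sketch below is illustrative only: it assumes the GP rating is the reference, that higher numbers mean greater urgency, and it uses hypothetical cut-offs (`NON_URGENT_MAX`, `EMERGENCY_MIN`) and made-up ratings that do not come from the study.

```python
# Hypothetical cut-offs on an assumed 1-8 urgency scale (not from the study).
NON_URGENT_MAX = 2   # levels 1-2 treated as non-urgent
EMERGENCY_MIN = 7    # levels 7-8 treated as emergency


def triage_direction_rates(ai_levels, gp_levels):
    """Share of cases where the AI rated higher, lower, or equal to the GP."""
    n = len(ai_levels)
    pairs = list(zip(ai_levels, gp_levels))
    over = sum(1 for a, g in pairs if a > g)    # AI more urgent than GP
    under = sum(1 for a, g in pairs if a < g)   # AI less urgent than GP
    exact = n - over - under
    return {"over": over / n, "under": under / n, "exact": exact / n}


def missed_emergencies(ai_levels, gp_levels):
    """Count cases the AI rated non-urgent but the GP rated an emergency."""
    return sum(
        1
        for a, g in zip(ai_levels, gp_levels)
        if a <= NON_URGENT_MAX and g >= EMERGENCY_MIN
    )


# Made-up paired ratings for illustration only.
ai_levels = [3, 4, 5, 2, 8]
gp_levels = [3, 3, 5, 2, 8]

rates = triage_direction_rates(ai_levels, gp_levels)
missed = missed_emergencies(ai_levels, gp_levels)
print(rates, missed)
```

Reporting the two directions separately matters because a single agreement figure hides whether disagreements err on the cautious side.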
Conclusion
This study demonstrates that AI-enabled triage can closely mirror clinical judgement in a primary care setting, offering a safe, scalable solution to manage demand for same-day care. Safe adoption of AI triage tools in healthcare should include real-world assessment and benchmarking against consensus clinician judgement in real time.
Key Takeaways
- AI-enabled triage tools can achieve substantial agreement with GP urgency assessments, with 84% categorical concordance and no observed cases of significant under-triage.
- The AI model demonstrated a safety-conscious design, favouring over-triage to reduce patient safety risks, especially in emergency scenarios.
- Older adults reported significantly lower satisfaction with AI triage, highlighting the need to address digital literacy and inclusion when implementing such tools.
- GPs expressed high confidence in AI performance at acuity extremes, particularly for self-care and emergency cases, though noted contextual limitations without EHR integration.
- This real-world study highlights the potential of AI triage to enhance clinical efficiency, particularly in managing same-day appointment demand in overstretched systems like the NHS.
- Ongoing clinician oversight remains essential to mitigate AI limitations in complex cases and ensure equitable, safe deployment at scale.