A Randomized-Clinical Trial of Two Ambient Artificial Intelligence Scribes: Measuring Documentation Efficiency and Physician Burnout

Paul J. Lukac
William Turner
Sitaram Vangala
Aaron T. Chin
Joshua Khalili
Ya-Chen Tina Shih
Catherine Sarkisian
Eric M. Cheng
John N. Mafi

Abstract

Importance

Ambient artificial intelligence (AI) scribes record patient encounters and generate visit notes almost instantaneously, representing a promising solution to documentation burden and associated physician burnout. Despite swift and widespread adoption of AI scribes, their impacts have not been examined in randomized-clinical trials.

Objective

To test the effectiveness of two AI scribes in reducing time spent writing notes and associated burnout in a randomized-clinical trial.

Design

Parallel three-arm pragmatic randomized-clinical trial in which physicians were assigned 1:1:1 via covariate-constrained randomization (balancing on time-in-note, baseline burnout score, and clinic days per week) to one of two AI scribe applications (Microsoft DAX or Nabla) or to a usual-care control group. The trial ran from November 4, 2024, to January 3, 2025.
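
For readers unfamiliar with covariate-constrained randomization, the general idea can be sketched as follows: generate many candidate 1:1:1 allocations, score each for imbalance on the balancing covariates, and pick at random from the best-balanced subset. This is a minimal illustration of the technique, not the trial's actual procedure. The imbalance metric, the number of candidates, the acceptance fraction, and the function names `imbalance` and `constrained_randomize` are all assumptions made for this sketch.

```python
import random
import statistics


def imbalance(assignment, covariates, n_arms=3):
    """Sum, over covariates, of the spread (max - min) of arm means in SD units.

    This metric is an assumption chosen for illustration.
    """
    score = 0.0
    for j in range(len(covariates[0])):
        column = [row[j] for row in covariates]
        sd = statistics.pstdev(column) or 1.0  # guard against a constant covariate
        arm_means = []
        for arm in range(n_arms):
            values = [column[i] for i, a in enumerate(assignment) if a == arm]
            arm_means.append(statistics.fmean(values))
        score += (max(arm_means) - min(arm_means)) / sd
    return score


def constrained_randomize(covariates, n_arms=3, n_candidates=2000,
                          keep_fraction=0.1, seed=0):
    """Pick one allocation at random from the best-balanced candidates."""
    rng = random.Random(seed)
    n = len(covariates)
    base = [i % n_arms for i in range(n)]  # near-equal arm sizes (1:1:1)
    candidates = []
    for _ in range(n_candidates):
        alloc = base[:]
        rng.shuffle(alloc)
        candidates.append((imbalance(alloc, covariates, n_arms), alloc))
    candidates.sort(key=lambda pair: pair[0])  # lowest imbalance first
    acceptable = candidates[:max(1, int(n_candidates * keep_fraction))]
    return rng.choice(acceptable)[1]
```

Each row of `covariates` would hold one physician's balancing variables, here time-in-note, baseline burnout score, and clinic days per week. Choosing randomly among the acceptable allocations, rather than always taking the single best one, keeps the assignment random.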

Setting

A large academic health system in California.

Participants

A total of 313 outpatient physicians were recruited through leadership referrals and department-wide emails; 238 participants representing 14 specialties qualified.

Intervention

Intervention-arm physicians gained access to an AI scribe for two months.

Main Outcomes and Measures

The primary outcome was change from baseline in log-transformed writing time-in-note. Secondary outcomes, measured by surveys, included Mini-Z 2.0, 4-item physician task load (TL), and Professional Fulfillment Index-Work Exhaustion (PFI-WE) scores to evaluate aspects of burnout, work environment, and stress, as well as targeted questions addressing safety and accuracy.
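
A change measured on the log scale is the difference between the logarithms of the follow-up and baseline values, which can be converted to a percent change. The sketch below shows that arithmetic. It assumes natural logarithms and a per-physician time measure; the abstract specifies neither, and the helper names and numbers are hypothetical, not trial data.

```python
import math


def log_change(baseline, followup):
    """Change from baseline on the natural-log scale: log(followup) - log(baseline)."""
    return math.log(followup) - math.log(baseline)


def percent_change_from_log(delta):
    """Convert a log-scale change into the equivalent percent change."""
    return 100.0 * math.expm1(delta)


# Hypothetical values for illustration only: note-writing time falls from 10.0 to 9.0.
delta = log_change(10.0, 9.0)
print(delta)                           # negative value: time decreased
print(percent_change_from_log(delta))  # the same change expressed in percent
```

Analyzing the outcome on the log scale makes effects multiplicative, so the same log-scale change is the same proportional reduction whether a physician's baseline writing time is short or long.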

Results

DAX was used in 33.5% of 24,696 visits; Nabla was used in 29.5% of 23,653 visits. Nabla users experienced a 9.5% decrease in time-in-note versus the control group (95% CI: -17.2%, -1.8%; p=.02) and a 7.8% decrease versus DAX users (95% CI: -15.5%, -0.1%; p=.05), while DAX users exhibited no significant change versus control (-1.7% [-9.4%, +5.9%]; p=.66). Total Mini-Z, scaled 10-50 with higher scores indicating improvement, increased among users of any scribe (+2.76 [+1.41, +4.10]; p<.001). Reductions in TL (scale 0-400, TL=-35.8 [-63.7, -7.9]; p=.01) and work exhaustion (scale 0-4, PFI-WE=-0.27 [-0.48, -0.07]; p=.01) were also seen among users of any scribe. One Grade 1 (mild) adverse event was reported, and clinically significant inaccuracies were noted “occasionally” on 5-point Likert questions (DAX 2.7 [2.4-3.0] vs. Nabla 2.8 [2.6-3.0]; p=.68).

Conclusion and Relevance

Use of Nabla reduced time-in-note, while use of any scribe led to modest improvements in physician burnout, work exhaustion, and task load. Performance was remarkably similar across two distinct vendor platforms, and occasional inaccuracies observed in either scribe require ongoing physician vigilance.

Trial Registration

ClinicalTrials.gov Identifier: NCT06792890

Version published to 10.1101/2025.07.10.25331333 on medRxiv
Jul 11, 2025
