Auditor Models to Suppress Poor AI Predictions Can Improve Human-AI Collaborative Performance

Katherine E. Brown
Jesse O. Wrenn
Nicholas J. Jackson
Michael R. Cauley
Benjamin Collins
Laurie Lovett Novak
Bradley A. Malin
Jessica S. Ancker

Abstract

Objective

Healthcare decisions are increasingly made with the assistance of machine learning (ML). ML models are known to exhibit unfairness, that is, inconsistent outcomes across subpopulations, and clinicians who over-rely on these systems can perpetuate it. Recent work on ML suppression, which silences predictions based on an audit of the ML, shows promise in mitigating the performance issues that originate from overreliance. This study evaluates the impact of suppression on the fairness of human-AI collaboration and evaluates ML uncertainty as a desideratum for auditing the ML.
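As a rough illustration of the suppression idea, the sketch below shows one way a simulated collaboration could be wired up. It is not the authors' implementation: the function name `collaborate`, the `auditor_score` input, the `threshold` value, and the single `accept_prob` acceptance rate are all assumptions made for this example.

```python
import numpy as np

def collaborate(human_pred, ml_pred, auditor_score, accept_prob,
                threshold=0.5, rng=None):
    """Return final decisions under a simple suppression rule.

    Hypothetical setup: auditor_score is the auditor's estimate that the ML
    prediction is reliable; predictions scoring below threshold are silenced.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    human_pred = np.asarray(human_pred)
    ml_pred = np.asarray(ml_pred)
    auditor_score = np.asarray(auditor_score, dtype=float)

    # The ML prediction is shown only when the auditor deems it reliable.
    shown = auditor_score >= threshold
    # A shown prediction is accepted by the clinician with probability accept_prob.
    accepted = shown & (rng.random(human_pred.shape[0]) < accept_prob)
    # Accepted cases take the ML prediction; all others keep the human decision.
    return np.where(accepted, ml_pred, human_pred)
```

With `accept_prob=1.0` the final decision follows the ML wherever the auditor lets a prediction through and the clinician everywhere else, which makes the effect of the threshold easy to inspect in isolation.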

Materials and Methods

We used data from the Vanderbilt University Medical Center electronic health record (n = 58,817) and the MIMIC-IV-ED dataset (n = 363,145) to predict likelihood of death or ICU transfer and likelihood of 30-day readmission. Our simulation study used gradient-boosted trees as well as an artificially high-performing oracle model. We derived clinician decisions directly from the dataset and simulated clinician acceptance of ML predictions based on previous empirical work on acceptance of clinical decision support (CDS) alerts. We measured performance as area under the receiver operating characteristic curve and algorithmic fairness as absolute averaged odds difference.
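The abstract does not spell out the formula behind the fairness metric. The sketch below uses one common definition of averaged odds difference, the mean of the between-group gaps in false positive rate and true positive rate, and takes its absolute value. The function names and the two-group interface are assumptions for illustration; some toolkits instead average the absolute values of the two gaps, which can give a larger number.

```python
import numpy as np

def rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and binary predictions."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    # Assumes the group contains at least one positive and one negative case.
    tpr = (y_pred & y_true).sum() / y_true.sum()
    fpr = (y_pred & ~y_true).sum() / (~y_true).sum()
    return tpr, fpr

def abs_average_odds_difference(y_true, y_pred, group, a, b):
    """|0.5 * ((FPR_a - FPR_b) + (TPR_a - TPR_b))| for subgroups a and b."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr_a, fpr_a = rates(y_true[group == a], y_pred[group == a])
    tpr_b, fpr_b = rates(y_true[group == b], y_pred[group == b])
    return abs(0.5 * ((fpr_a - fpr_b) + (tpr_a - tpr_b)))
```

A value of 0 means the two subgroups have matching error rates on average; larger values indicate a wider gap.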

Results

When the ML outperforms humans, suppression outperforms the human alone (p < 0.034) and at least does not degrade fairness. When the human outperforms the ML, suppression outperforms the human (p < 5.2 × 10^-5) but the human is fairer than suppression (p < 0.0019). Finally, incorporating uncertainty quantification into suppression approaches can improve performance.

Conclusion

Suppression of poor-quality ML predictions through an auditor model shows promise in improving collaborative human-AI performance and fairness.

Version published to 10.1101/2025.06.24.25330212v1 on medRxiv
Jun 24, 2025

