Harnessing Human Uncertainty to Train More Accurate and Aligned AI Systems

Abstract

AI-augmented decision-making (AIADM) aims to leverage the computational power of machine learning (ML) models to assist humans in their decision-making processes. In many such systems, especially for complex tasks like medical image classification, ML models are trained on large datasets annotated by humans. Failing to account for human decision-making biases when constructing these labeled datasets can introduce bias into the data, and models trained on such data can inherit that bias. We propose a novel approach to developing AIADM systems that aims to overcome these challenges by harnessing human uncertainty. Our approach has three elements: we collect subjective judgments from human annotators, we calibrate those judgments, and we use the recalibrated judgments to create probabilistic (i.e., soft) labels on which the AI decision aid is then trained. We evaluate our methods through two studies using data from DiagnosUs, a crowdsourcing platform for medical image annotation. Across multiple training datasets, we assess how our proposed methods affect three key properties of AI decision aids: accuracy, calibration, and alignment with human uncertainty. We refer to these properties as the AIADM tri-criteria. Our results show that ML models trained on recalibrated soft labels are more accurate and better aligned with expert judgments. We also observe a tradeoff between ML calibration and alignment with human uncertainty. These findings highlight the value of capturing and correcting human uncertainty in ML training data and the need to consider the tri-criteria when developing AI systems.
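To make the pipeline concrete, the sketch below shows one plausible way to (1) recalibrate annotators' subjective probabilities and (2) train a classifier on the resulting soft labels. It is a minimal illustration under assumptions, not the authors' implementation: the logistic recalibration curve, the binary-classification setup, and all variable names are hypothetical, and the toy data stands in for the DiagnosUs annotations.

# Minimal sketch (illustrative assumptions, not the paper's method):
# recalibrate annotator probabilities, then train on soft labels.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def recalibrate(raw_probs, a=1.5, b=0.0):
    # Pass raw annotator probabilities through a logistic recalibration curve.
    # In practice (a, b) would be fit on items with known ground truth;
    # fixed values are used here purely for illustration.
    logits = np.log(raw_probs / (1.0 - raw_probs))
    return 1.0 / (1.0 + np.exp(-(a * logits + b)))

# Toy data: 100 items with 8-dim features and an annotator probability
# for class 1 per item (stand-in for aggregated subjective judgments).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8)).astype(np.float32)
raw = rng.uniform(0.05, 0.95, size=100)
p1 = recalibrate(raw)                                   # recalibrated P(class = 1)
soft_labels = np.stack([1.0 - p1, p1], axis=1).astype(np.float32)

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

X_t = torch.from_numpy(X)
y_t = torch.from_numpy(soft_labels)
for _ in range(200):
    opt.zero_grad()
    # Cross-entropy against soft (probabilistic) targets instead of hard labels.
    loss = -(y_t * F.log_softmax(model(X_t), dim=1)).sum(dim=1).mean()
    loss.backward()
    opt.step()

Training against soft targets in this way lets the model's predicted probabilities track the (recalibrated) distribution of human judgments rather than collapsing each item to a single hard label, which is the property the tri-criteria evaluation of calibration and alignment is meant to probe.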
