Understanding the Influence of Design-Related Factors on Human-AI Teaming in a Face Matching Task
Abstract
There is a steep rise in the use of decision aids enabled by artificial intelligence (AI) for facial identity verification. The use of such systems and the impact of their implementation on human decision-making are still not well understood. The current study explored factors associated with the design of the task paradigm and the presentation of predictions from AI systems. Across three pre-registered experiments, we examined the impact of (a) implied AI accuracy, (b) mismatch frequency (i.e. the proportion of match and mismatch pairs), and (c) advice type (binary only vs. binary + similarity rating) on performance in a one-to-one face matching task. Participants' performance improved when aided by AI compared to a baseline without decision support. The largest improvement was observed when no information on the AI's overall accuracy was provided. Further, the frequency of mismatches did not influence performance. Finally, similarity ratings marginally improved overall performance and increased users' certainty in their decisions, but did not help participants dismiss inaccurate predictions. Two further findings were consistent across all experiments. First, participants often failed to dismiss inaccurate AI predictions, resulting in significantly lower accuracy than in the accurate-prediction condition. Second, at the group level, the human-AI team did not outperform the AI alone, although examination of individual performance showed that some participants were able to exceed the AI's accuracy. These findings contribute towards determining appropriate formats for presenting AI predictions in a human-in-the-loop system, so that the performance of the human-AI team can be maximised.