Experimental evidence that delegating to intelligent machines can increase dishonest behaviour

Nils Köbis
Zoe Rahwan
Clara Bersch
Tamer Ajaj
Jean-François Bonnefon
Iyad Rahwan
Bramantyo Supriyatno
Raluca Rilla

Abstract

While artificial intelligence enables productivity gains from delegating tasks to machines, it may facilitate the delegation of unethical behaviour. Here we demonstrate this risk by having human principals instruct machine agents to perform tasks with incentives to cheat. Requests for cheating increased when principals could induce machine dishonesty without telling the machine what to do, through supervised learning or high-level goal-setting. These effects held whether delegation was voluntary or mandatory. We also examined delegation via natural language to large language models. While principals' cheating requests were not always higher for machine agents, compliance diverged sharply: Machines were far more likely than human agents to carry out unethical instructions. This compliance could be curbed with the injection of prohibitive, task-specific guardrails. Our results highlight ethical risks in the context of increasingly accessible and powerful machine delegation, and suggest design and policy strategies to mitigate them.

Version published to 10.31219/osf.io/dnjgz_v2 on OSF Preprints: May 6, 2025
Version published to 10.31219/osf.io/dnjgz on OSF Preprints: Oct 4, 2024
