Suppressing echo cascades in language-model agents with multi-critic plan selection


Abstract

Language-model agents act across many turns, and unsupported claims persist when downstream turns endorse and repeat them. We decompose propagation into injection, endorsement, and repetition, and propose CascadeGuard, a multi-critic planner that separates proposal from selection. Lightweight critics score candidate plans for utility, norm compliance, and epistemic support under a fixed token budget (15,630 tokens/step). Across collaboration, dialogue, and retrieval tasks, CascadeGuard improves success by 12–13 percentage points while reducing norm violations by 69–77% and propagation by 78%. Mechanism interventions show that endorsement suppression yields a 2.19× larger propagation reduction than injection blocking, validated independently by an automatic detector and human annotators using a pragmatic endorsement taxonomy (n=18 annotators, 840 items; effect ratio 2.34, 95% CI: 1.91, 2.84). External validation on held-out benchmarks (MultiWOZ-Conflict, DSTC11-InfoSeek) confirms that gains persist without retuning (+19.2 pp AA, −8.1 pp unsupported). Human validation (n=624; Krippendorff's α=0.843) confirms strong metric–severity correlation (ρ=0.88), indicating that automatic propagation metrics reflect genuine epistemic harm.
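The proposal–selection separation described above can be sketched in a few lines: a proposer emits candidate plans, and a set of lightweight critics scores each candidate for utility, norm compliance, and epistemic support, with over-budget candidates filtered out before scoring. This is an illustrative sketch only; all names (`Candidate`, `select_plan`, the toy critics) are hypothetical and do not reflect the paper's actual implementation.

```python
# Hypothetical sketch of multi-critic plan selection, following the abstract:
# critics score candidates, a fixed per-step token budget constrains choices,
# and the selector picks the highest-scoring feasible candidate.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Candidate:
    text: str
    tokens: int  # estimated token cost of this plan step

Critic = Callable[[Candidate], float]  # returns a score in [0, 1]

def select_plan(candidates: List[Candidate],
                critics: List[Critic],
                token_budget: int) -> Candidate:
    """Pick the candidate with the highest mean critic score,
    discarding any candidate that exceeds the per-step token budget."""
    feasible = [c for c in candidates if c.tokens <= token_budget]
    if not feasible:
        raise ValueError("no candidate fits the token budget")
    return max(feasible,
               key=lambda c: sum(critic(c) for critic in critics) / len(critics))

# Toy critics standing in for the utility / norm / epistemic-support critics.
utility = lambda c: 0.9 if "verify" in c.text else 0.5
norms   = lambda c: 0.2 if "assert unsupported claim" in c.text else 1.0
support = lambda c: 1.0 if "cite source" in c.text else 0.4

plans = [
    Candidate("assert unsupported claim", tokens=120),
    Candidate("verify claim and cite source", tokens=400),
    Candidate("repeat prior answer", tokens=80),
]
best = select_plan(plans, [utility, norms, support], token_budget=15_630)
print(best.text)  # → verify claim and cite source
```

Under this sketch, the epistemically supported plan wins even though it costs more tokens, which mirrors the abstract's claim that selection (not generation) is where endorsement of unsupported claims is suppressed.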
