Quantifying Claim Robustness Through Adversarial Framing: An AI-Enabled Diagnostic Tool
Abstract
This article introduces the Adversarial Claim Robustness Diagnostics (ACRD) protocol, a novel conceptual framework for assessing how factual claims withstand ideological distortion. Building on Tarski's (1944) semantic theory, contemporary work in cultural cognition (Kahan, 2017), adversarial collaboration (Ceci et al., 2024), and the Devil's Advocate Approach (Vrij et al., 2023), we develop a three-phase evaluation process combining baseline evaluation, adversarial speaker reframing, and dynamic calibration, culminating in a quantified robustness score. We model the evaluation of claims by ideologically opposed groups as a strategic game and use its Bayesian Nash equilibrium to infer how evaluators are likely to behave after the adversarial collaboration phase. The ACRD addresses shortcomings of traditional fact-checking identified by Nyhan and Reifler (2010) and employs large language models (Argyle et al., 2023) to simulate counterfactual attributions while mitigating potential biases (Zhang et al., 2018; González-Sendino et al., 2024). Illustrative, yet-to-be-explored applications, ranging from climate change to trade policy discourse, demonstrate the framework's ability to identify the boundary conditions of persuasive validity across polarized groups.
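The abstract does not specify how the robustness score is computed. As a minimal illustrative sketch only (the function name, the 0-1 agreement scale, the sample ratings, and the scoring formula below are assumptions, not the ACRD's actual definitions), one could quantify robustness as the complement of the mean shift in claim ratings between the baseline phase and the adversarial speaker-reframing phase:

```python
from statistics import mean

def robustness_score(baseline_ratings, reframed_ratings):
    """Hypothetical robustness score: 1 minus the mean absolute shift in
    evaluators' agreement ratings (assumed to lie on a 0-1 scale) when the
    same claim is re-attributed to an ideologically adversarial speaker."""
    shifts = [abs(b - r) for b, r in zip(baseline_ratings, reframed_ratings)]
    return 1.0 - mean(shifts)

# Hypothetical example: a claim whose acceptance barely moves under
# adversarial reframing scores near 1 (robust); large shifts score lower.
baseline = [0.8, 0.7, 0.9, 0.6]   # phase 1: baseline evaluations
reframed = [0.7, 0.6, 0.85, 0.5]  # phase 2: adversarial speaker reframing
print(robustness_score(baseline, reframed))  # ~0.91, i.e. fairly robust
```

Under this reading, the dynamic calibration phase would adjust such raw scores for group-level response tendencies before comparison across polarized audiences; the paper's own formulation should be consulted for the actual scoring and calibration rules.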