A Trust-Weighted Hybrid Federated Fine-Tuning Architecture for Retrieval-Augmented Large Language Models

Abstract

Federated fine-tuning for retrieval-augmented language models enables collaboration across data silos without sharing raw corpora, but it is vulnerable to unreliable clients whose retrieval corpora degrade retrieval behavior and, consequently, the quality of augmented inputs. This paper presents a label-free trust-weighted aggregation framework that estimates client reliability from retrieval-consistency signals and uses the resulting trust scores to control each client's influence on the global update. For a fixed probe prompt set, each client constructs a retrieval-distribution histogram from top-K retrieved items, and Jensen–Shannon divergence to a round-wise global reference histogram quantifies retrieval deviation. An exponential mapping and exponential moving average smoothing produce stable trust trajectories over federated rounds, and normalized trust scores define aggregation weights with an explicit sum-to-one constraint. Experiments with four clients under a controlled corpus-poisoning protocol show that benign clients maintain consistently high trust while the poisoned client converges to a lower trust regime and is slightly but consistently down-weighted. All reported curves and summary statistics are derived from logged artifacts and deterministic scripts without manual post-processing, supporting full reproducibility of the trust and weighting dynamics.
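The trust pipeline described above (divergence measurement, exponential mapping, EMA smoothing, weight normalization) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names and the hyperparameters `alpha` (divergence sensitivity) and `beta` (EMA smoothing factor) are assumptions chosen for the example.

```python
import math

def js_divergence(p, q):
    # Jensen-Shannon divergence (base-2 logs) between two normalized
    # retrieval-distribution histograms of equal length.
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def update_trust(prev_trust, jsd, alpha=5.0, beta=0.8):
    # Map retrieval deviation to instantaneous trust via an exponential,
    # then smooth across rounds with an exponential moving average.
    raw = math.exp(-alpha * jsd)
    return beta * prev_trust + (1 - beta) * raw

def aggregation_weights(trusts):
    # Normalize trust scores so the aggregation weights sum to one.
    total = sum(trusts)
    return [t / total for t in trusts]
```

In a given round, the server would compute each client's JS divergence against the global reference histogram, update its trust trajectory, and aggregate model updates with the normalized weights; a client whose retrieval distribution drifts (e.g. under corpus poisoning) accumulates higher divergence and is down-weighted accordingly.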
