A Trust-Weighted Hybrid Federated Fine-Tuning Architecture for Retrieval-Augmented Large Language Models

Abstract

Federated fine-tuning for retrieval-augmented language models enables collaboration across data silos without sharing raw corpora, but it is vulnerable to unreliable clients whose retrieval corpora degrade retrieval behavior and, consequently, the quality of augmented inputs. This paper presents a label-free trust-weighted aggregation framework that estimates client reliability from retrieval-consistency signals and uses the resulting trust scores to control each client's influence on the global update. For a fixed probe prompt set, each client constructs a retrieval-distribution histogram from top-K retrieved items, and Jensen–Shannon divergence to a round-wise global reference histogram quantifies retrieval deviation. An exponential mapping and exponential moving average smoothing produce stable trust trajectories over federated rounds, and normalized trust scores define aggregation weights with an explicit sum-to-one constraint. Experiments with four clients under a controlled corpus-poisoning protocol show that benign clients maintain consistently high trust while the poisoned client converges to a lower trust regime and is slightly but consistently down-weighted. All reported curves and summary statistics are derived from logged artifacts and deterministic scripts without manual post-processing, supporting full reproducibility of the trust and weighting dynamics.
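The trust pipeline described above (divergence measurement, exponential mapping, EMA smoothing, weight normalization) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names and the hyperparameters `alpha` (divergence sensitivity) and `beta` (EMA smoothing factor) are assumptions chosen for the example.

```python
import math

def js_divergence(p, q):
    # Jensen-Shannon divergence (base-2 logs) between two normalized
    # retrieval-distribution histograms of equal length.
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def update_trust(prev_trust, jsd, alpha=5.0, beta=0.8):
    # Map retrieval deviation to instantaneous trust via an exponential,
    # then smooth across rounds with an exponential moving average.
    raw = math.exp(-alpha * jsd)
    return beta * prev_trust + (1 - beta) * raw

def aggregation_weights(trusts):
    # Normalize trust scores so the aggregation weights sum to one.
    total = sum(trusts)
    return [t / total for t in trusts]
```

In a given round, the server would compute each client's JS divergence against the global reference histogram, update its trust trajectory, and aggregate model updates with the normalized weights; a client whose retrieval distribution drifts (e.g. under corpus poisoning) accumulates higher divergence and is down-weighted accordingly.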
