A Practical Implementation to Federated Learning: Detecting Backdoor Attack on Next-word Prediction Model
Abstract
This article examines the impact of a backdoor attack on a next-word prediction model trained with federated learning and proposes a detection mechanism for such attacks, using a presidential election as the example. By controlling a portion of the machines that join federated learning, an attacker can influence the model and thereby change the perception of a particular subject, such as a presidential election. Different ratios of negative datasets participating in federated learning were tested to investigate the correlation between the percentage of negative datasets and the negative sentiment of the generated sentences. The detection mechanism was also evaluated to determine whether devices with deviating, abnormal datasets can be identified and banned from the training pool, reducing the effect of the attack on the model. The results confirm the hypothesis of a positive relation between the percentage of bad devices in the training pool and the negative perception, toward a particular subject, of the sentences generated by the model. The proposed mechanism helps reduce the impact of the attack when the percentage of connected bad devices is small.
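The abstract does not specify how deviating devices are identified, so the following is only a minimal sketch of one possible approach, not the article's mechanism. It assumes each device submits its model update as a flat list of numbers, flags updates that lie unusually far from the coordinate-wise median, and averages the rest. The function names `flag_outlier_clients` and `federated_average`, the `threshold` value, and the toy update vectors are all hypothetical.

```python
from statistics import median


def flag_outlier_clients(updates, threshold=3.0):
    """Return indices of clients whose update is far from the coordinate-wise median.

    Assumption: `updates` is a list of equal-length lists of floats,
    one per participating device. This is an illustrative heuristic only.
    """
    dims = range(len(updates[0]))
    # Robust "typical" update: the median of each coordinate across clients.
    center = [median(u[d] for u in updates) for d in dims]
    # Euclidean distance of every client update from that center.
    dists = [sum((u[d] - center[d]) ** 2 for d in dims) ** 0.5 for u in updates]
    med = median(dists)
    # Median absolute deviation; fall back to a tiny value to avoid dividing by zero.
    mad = median(abs(x - med) for x in dists) or 1e-12
    return [i for i, x in enumerate(dists) if (x - med) / mad > threshold]


def federated_average(updates, banned):
    """Average the updates of all clients that were not banned."""
    kept = [u for i, u in enumerate(updates) if i not in banned]
    return [sum(col) / len(kept) for col in zip(*kept)]


# Toy data (hypothetical): four similar updates and one strongly deviating update.
updates = [
    [0.10, 0.20],
    [0.12, 0.18],
    [0.09, 0.21],
    [0.11, 0.19],
    [5.00, -4.00],
]
banned = flag_outlier_clients(updates)
global_update = federated_average(updates, banned)
```

A median-based filter of this kind only works while the deviating devices are a minority of the pool, which is in line with the abstract's observation that the mechanism helps when the percentage of bad devices is small.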