A Reinforcement Learning-based Approach for Dynamic Privacy Protection in Genomic Data Sharing Beacons
Abstract
The rise of genomic sequencing has led to significant privacy concerns due to the sensitive and identifiable nature of genomic data. The Beacon Project, initiated by the Global Alliance for Genomics and Health (GA4GH), was designed to enable privacy-preserving sharing of genomic information via an online querying system. However, studies have revealed that the protocol is vulnerable to membership inference attacks, which can expose the presence of individuals in sensitive datasets. Various countermeasures, such as noise addition and query restrictions, have been proposed, but their static implementation leaves them prone to attackers who can adapt and change strategies. In this study, we present the first reinforcement learning (RL)-based approach for dynamic privacy protection of the beacon protocol. We employ a multi-player RL setting in which we train (i) a “Generic-Beacon-Defender” agent, which can adjust the honesty rate of its responses, against (ii) a “Generic-Beacon-Attacker” agent, which can choose the order of its queries and interleave random queries to make the beacon believe it is a regular user. This is the first defense mechanism capable of adapting its strategy in real time based on user queries, distinguishing between legitimate users and potential attackers, and applying tailored policies accordingly. By doing so, this method enhances both privacy and utility, effectively countering sophisticated and evolving threats. The code and the models are available at github.com/ciceklab/beacon-defense-strategies.
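To make the setting concrete, the following minimal sketch illustrates the beacon query game described above: a beacon answers yes/no variant-presence queries, a defender perturbs answers according to an honesty rate, and an attacker mixes membership-inference queries with random decoys. The class names, the fixed honesty rate, and the simple hit-fraction score are illustrative assumptions, not the authors' actual implementation or trained RL policies.

```python
# Illustrative sketch of the beacon membership-inference setting (assumed, simplified).
import random


class Beacon:
    def __init__(self, dataset):
        # dataset: set of variant identifiers present in the beacon population
        self.dataset = dataset

    def true_answer(self, variant):
        return variant in self.dataset


class Defender:
    def __init__(self, honesty_rate=0.9):
        # Probability of answering a query truthfully; otherwise the answer is flipped.
        # A learned defender policy would choose this per query rather than fix it.
        self.honesty_rate = honesty_rate

    def respond(self, beacon, variant):
        truth = beacon.true_answer(variant)
        return truth if random.random() < self.honesty_rate else not truth


class Attacker:
    def __init__(self, target_variants, decoy_variants, decoy_prob=0.3):
        # target_variants: variants of the individual whose membership is being tested
        # decoy_variants: unrelated queries used to mimic a regular user
        self.targets = list(target_variants)
        self.decoys = list(decoy_variants)
        self.decoy_prob = decoy_prob
        self.hits = 0
        self.queries = 0

    def next_query(self):
        # A learned attacker policy could reorder target queries and decide when
        # to insert decoys; here the choice is random for illustration.
        if self.decoys and random.random() < self.decoy_prob:
            return self.decoys.pop(), False
        return (self.targets.pop(), True) if self.targets else (None, False)

    def observe(self, answer, was_target):
        if was_target:
            self.queries += 1
            self.hits += int(answer)

    def membership_score(self):
        # Crude proxy for a membership-inference statistic: fraction of the
        # target's variants that the beacon reported as present.
        return self.hits / self.queries if self.queries else 0.0


if __name__ == "__main__":
    random.seed(0)
    beacon = Beacon(dataset={"rs1", "rs2", "rs5", "rs7"})
    defender = Defender(honesty_rate=0.8)
    attacker = Attacker(target_variants={"rs1", "rs2", "rs5"},
                        decoy_variants={"rs9", "rs10", "rs11"})
    for _ in range(6):
        variant, was_target = attacker.next_query()
        if variant is None:
            break
        attacker.observe(defender.respond(beacon, variant), was_target)
    print(f"attacker membership score: {attacker.membership_score():.2f}")
```

In the RL formulation the paper describes, both the honesty rate and the query order/decoy decisions would be actions selected by trained agents rather than the fixed probabilities used in this toy example.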