dsTidyverse: An implementation of Tidyverse within the DataSHIELD ecosystem
Abstract
This paper introduces dsTidyverse, an R package designed to improve data handling within the federated analysis platform DataSHIELD. DataSHIELD enables multi-site analysis without direct data sharing, which is crucial for privacy-sensitive research. Although DataSHIELD supports complex analyses, it lacks user-friendly data manipulation tools. dsTidyverse addresses this gap by implementing selected functions from the “Tidyverse” ecosystem within DataSHIELD’s client-server architecture. The package provides functionality for selecting, renaming, and creating columns; conditional recoding; combining data frames; filtering rows; grouping data; and converting to tibbles. Rigorous disclosure checks are implemented to prevent leakage of individual-level data. Through examples, the paper demonstrates how dsTidyverse simplifies common data manipulation tasks, improving user experience and analysis efficiency within DataSHIELD. The package is open source and freely available on CRAN and GitHub, and further development is welcome. See https://github.com/molgenis/ds-tidyverse
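As a rough illustration of the kinds of operations listed above, the following minimal sketch runs the plain dplyr equivalents on a local toy data set. It does not call dsTidyverse itself: the data frames `site_a` and `site_b`, their columns, and the cut-offs are hypothetical, and in DataSHIELD the corresponding steps would run on the servers, subject to disclosure checks, rather than on local data.

```r
library(dplyr)

# Hypothetical toy data standing in for two data frames to be combined.
site_a <- data.frame(id = 1:4,
                     age = c(34, 51, 67, 45),
                     bmi = c(22.1, 27.8, 31.4, 24.9))
site_b <- data.frame(id = 5:6,
                     age = c(29, 73),
                     bmi = c(19.5, 28.2))

result <- bind_rows(site_a, site_b) %>%      # combine data frames
  as_tibble() %>%                            # convert to a tibble
  select(id, age, bmi) %>%                   # select columns
  rename(body_mass_index = bmi) %>%          # rename a column
  mutate(age_group = case_when(              # create a column by conditional recoding
    age < 40 ~ "under_40",
    age < 65 ~ "40_to_64",
    TRUE     ~ "65_plus"
  )) %>%
  filter(body_mass_index >= 20) %>%          # filter rows
  group_by(age_group)                        # group data

print(result)
```

The pipeline is written as a single chain so that each verb maps onto one item in the list of functionality; the actual dsTidyverse function names and arguments are documented in the package repository.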