Fixing the plumbing: Building interoperability between wastewater genomic surveillance datasets and systems using the PHA4GE contextual data specification
Abstract
The evolution of wastewater genomic surveillance (WWGS) has led to the development of many new methodologies, allowing WWGS to be applied broadly to the detection and monitoring of diverse pathogens and genetic markers. Variability in techniques and approaches creates challenges for data integration and interoperability that hinder the analyses necessary for public health insights. Here, the Public Health Alliance for Genomic Epidemiology (PHA4GE), in collaboration with scientists and stakeholders from over 20 countries as well as global data repositories, presents a wastewater contextual data specification package relevant to a wide array of public health and research use cases. The PHA4GE wastewater contextual data specification is an ISO-compatible, ontology-based, modular data standard that is implemented by a free, open-source data curation and validation tool called the DataHarmonizer. To facilitate interoperability and data sharing, interchange formats and instructions for automated transformations are included in the package's supporting documentation. The specification package is part of a growing library of interoperable pathogen/target-specific standards built on a shared framework using semantic best practices. We hope that this standard will not only aid in the implementation of WWGS, but also serve as an exemplar for the development of related data standards, such as those for other environmental use cases or other metagenomic surveillance efforts.