Semantic Interoperability at National Scale: The SPHN Federated Clinical Routine Dataset
Abstract
Over the past eight years, the Swiss Personalized Health Network (SPHN) has established a national federated framework enabling semantically interoperable health-related data, with a primary focus on hospital clinical routine data. Rather than centralizing patient-level information, each hospital performs semantic coding and standardization locally and stores SPHN-compliant data in a dedicated triple store. To promote discoverability, descriptive metadata and summary statistics derived from these local datasets are then centralized in the SPHN Metadata Catalog, which follows the SPHN Metadata Catalog Schema and aligns with European Health Data Space metadata standards. As of 2025, the SPHN Federated Clinical Routine Dataset encompasses information from more than 800,000 patients who provided broad consent, covering the period from 2018 to the present. Across the first six participating hospitals, the infrastructure holds over 12.5 billion (12.5 × 10^9) RDF triples mapped to 125 SPHN semantic concepts, including demographics, diagnoses, procedures, medications, laboratory results, vital signs, clinical scores, allergies, microbiology, intensive care data, oncology, and biological samples. This federated approach ensures that health data remain FAIR (Findable, Accessible, Interoperable, and Reusable) while safeguarding patient privacy by avoiding the centralization of patient-level information. In this paper, we present the design, implementation, and scope of the SPHN Federated Clinical Routine Dataset, and its role in supporting data discoverability for research and clinical applications.
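The federated pattern described above, in which patient-level triples stay at each hospital and only aggregate statistics reach a central catalog, can be illustrated with a minimal Python sketch. This is not the SPHN implementation: the namespace `EX`, the predicate `hasPatient`, the concept names, and the functions `summarize_local_store` and `merge_into_catalog` are all hypothetical, and an in-memory list of tuples stands in for a real triple store.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/sphn-like#"  # hypothetical namespace, not a real SPHN IRI


def summarize_local_store(triples, hospital_id):
    """Run at the hospital: reduce patient-level triples to aggregate counts.

    Only the returned summary would leave the site; no subject IRIs or
    patient identifiers are included in it.
    """
    instances_per_concept = Counter()
    patients = set()
    triple_count = 0
    for subject, predicate, obj in triples:
        triple_count += 1
        if predicate == RDF_TYPE:
            instances_per_concept[obj] += 1
        elif predicate == EX + "hasPatient":  # hypothetical predicate
            patients.add(obj)
    return {
        "hospital": hospital_id,
        "triple_count": triple_count,
        "patient_count": len(patients),
        "instances_per_concept": dict(instances_per_concept),
    }


def merge_into_catalog(summaries):
    """Run centrally: combine per-hospital summaries into catalog totals."""
    total_triples = 0
    total_per_concept = Counter()
    for summary in summaries:
        total_triples += summary["triple_count"]
        total_per_concept.update(summary["instances_per_concept"])
    return {
        "hospitals": [s["hospital"] for s in summaries],
        "triple_count": total_triples,
        "instances_per_concept": dict(total_per_concept),
    }


# Toy data standing in for one hospital's local triple store.
local_triples = [
    (EX + "dx1", RDF_TYPE, EX + "Diagnosis"),
    (EX + "dx1", EX + "hasPatient", EX + "patientA"),
    (EX + "dx2", RDF_TYPE, EX + "Diagnosis"),
    (EX + "dx2", EX + "hasPatient", EX + "patientB"),
    (EX + "lab1", RDF_TYPE, EX + "LabResult"),
    (EX + "lab1", EX + "hasPatient", EX + "patientA"),
]

summary = summarize_local_store(local_triples, "hospital-1")
catalog = merge_into_catalog([summary])
```

In this sketch the central catalog only ever sees counts per concept and per hospital, which is the property that lets a dataset be discoverable without exposing patient-level records. Patient counts are deliberately not summed across hospitals here, since doing so would assume that no patient appears at more than one site.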