Registry Forge: an open-source end-to-end pipeline for patient-directed SMART on FHIR registries


Abstract

Objectives: Patient-directed SMART on FHIR lets registries acquire longitudinal electronic health record data, but the resulting payload requires substantial engineering before it can be used. We present Registry Forge, an open-source pipeline that converts that payload into research-ready outputs.

Materials and Methods: Registry Forge decodes and parses mixed C-CDA, HTML, RTF, PDF, and FHIR inputs, joins records to a canonical patient identifier, and emits a browser-viewable dashboard, an OMOP CDM v5.4 data set, GA4GH Phenopackets v2, a code inventory, and regex extractions of disease-specific narrative content.

Results: Applied to the ALS Research Collaborative Study (94 participants, 56 US health systems), the pipeline processed 22,686 source files and 1,791 FHIR Bundles (109,599 resources); only 15.0% of files were full C-CDA.

Discussion: This pipeline generalizes to any registry acquiring data through patient-directed SMART on FHIR.

Conclusion: Registry Forge closes the acquisition-to-analysis gap with no server infrastructure and is openly available.
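
To illustrate the kind of format handling the Materials and Methods describe, here is a minimal Python sketch of sorting already-decoded payloads into C-CDA, HTML, RTF, PDF, and FHIR. It is not taken from Registry Forge: the function name `sniff_format`, the returned labels, and the 2048-byte inspection window are all assumptions made for this example, and the checks are simple heuristics (for instance, a namespace-prefixed C-CDA root element would not be recognized).

```python
import json


def sniff_format(payload: bytes) -> str:
    """Guess the format of one decoded payload (hypothetical helper).

    Returns one of "pdf", "rtf", "fhir", "ccda", "html", or "unknown".
    """
    # Only the start of the payload is inspected for magic markers.
    head = payload.lstrip()[:2048]

    # PDF and RTF files begin with well-known signatures.
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"{\\rtf"):
        return "rtf"

    # FHIR JSON resources are objects carrying a "resourceType" key.
    if head.startswith(b"{"):
        try:
            doc = json.loads(payload)
        except ValueError:  # covers malformed JSON and undecodable bytes
            return "unknown"
        if isinstance(doc, dict) and "resourceType" in doc:
            return "fhir"
        return "unknown"

    # C-CDA documents are XML with a ClinicalDocument root element.
    lowered = head.lower()
    if b"<clinicaldocument" in lowered:
        return "ccda"
    if b"<html" in lowered or b"<!doctype html" in lowered:
        return "html"
    return "unknown"
```

Counting the labels returned over a set of files would yield a format breakdown of the kind reported in the Results, although the criteria the authors use to classify a file as "full C-CDA" are not stated in the abstract.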
