Developing ReAcT: Methodological Foundations of a FAIR Data Repository for HIV Treatment in Brazil
Abstract
Background: Data fragmentation and limited interoperability severely challenge the strategic use of public health information, particularly for HIV/AIDS care in Brazil, hindering the comprehensive analysis of the care continuum. Aligning these heterogeneous national datasets with the FAIR Principles (Findable, Accessible, Interoperable, Reusable) is crucial but faces practical implementation gaps. This study outlines the methodological foundation for developing the Repository of Access to HIV Treatment (ReAcT).

Methods: ReAcT was developed using a FAIR-aligned approach to data curation and harmonization. The methodology involved the integration of secondary data from multiple, independent national health information systems, specifically SINAN, SICLOM, SISCEL, SIA, and SIM. The core process consisted of data standardization, quality control, and deterministic or probabilistic record linkage to reconstruct patient trajectories. The entire data pipeline was operationalized using a reproducible, open-source R-language function to ensure analytical precision and consistency.

Results: The ReAcT repository successfully consolidated approximately 1.3 million records from these disparate systems, creating a structured, research-ready database. This methodology enabled the structured cross-referencing of information (e.g., linking case notifications with ART initiation and monitoring) at the municipal level, allowing for the identification of patterns suggesting real gaps in care continuity.

Conclusion: The ReAcT framework demonstrates the feasibility of building a FAIR-aligned data repository by overcoming data fragmentation, providing a reliable foundation to identify care gaps and support strategic decision-making in the Brazilian Unified Health System (SUS). The proposed methodology is highly scalable to other chronic diseases (such as tuberculosis or diabetes) and is adaptable to similar public health challenges in low- and middle-income country contexts.
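
Illustrative sketch: The standardization and deterministic linkage steps named in the Methods could look roughly like the following. This is not the authors' code: their pipeline is an R function, while this sketch uses Python for brevity. The field names (`name`, `mother_name`, `birth_date`, `event`), the choice of linkage key, and the two toy records are all assumptions invented for illustration, and the probabilistic linkage also mentioned in the Methods is not shown.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Standardize a name: strip accents, uppercase, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.upper().split())


def link_key(record):
    """Build a deterministic linkage key (hypothetical choice of fields)."""
    return (
        normalize_name(record["name"]),
        normalize_name(record["mother_name"]),
        record["birth_date"],
    )


def link_records(sources):
    """Group records from several systems that share the same linkage key."""
    trajectories = defaultdict(list)
    for source_name, records in sources.items():
        for record in records:
            trajectories[link_key(record)].append((source_name, record))
    return dict(trajectories)


# Invented toy records; real SINAN and SICLOM extracts have different schemas.
sources = {
    "SINAN": [
        {"name": "João da Silva", "mother_name": "Maria da Silva",
         "birth_date": "1980-05-17", "event": "case notification"},
    ],
    "SICLOM": [
        {"name": "JOAO  DA SILVA", "mother_name": "maria da silva",
         "birth_date": "1980-05-17", "event": "ART dispensation"},
    ],
}

linked = link_records(sources)
```

In this toy case both records normalize to the same key, so `linked` holds a single trajectory containing the notification and the dispensation. Exact-match keys like this miss records with typos or missing fields, which is the usual reason for complementing deterministic linkage with a probabilistic step.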