Creating a Research-Ready Data Asset version of primary care data for Wales and investigating the impact of COVID-19 on utilisation of primary care services

Abstract

Objectives

We developed an efficient Research-Ready Data Asset (RRDA) for the Welsh Longitudinal General Practice (WLGP) data within the Secure Anonymised Information Linkage Databank to standardise curation, enhance reproducibility, and facilitate research on primary care trends. Using this, we investigated primary care activity trends during and after the COVID-19 pandemic.

Methods

The RRDA pipeline cleans the data, curates it using GP-registration history, and transforms it into a structured, normalised format to support efficient large-scale queries. A comprehensive clinical code look-up was developed, incorporating official, local, and supplementary categories to enhance event classification. To enable analysis of patient-practice interactions, a four-layer approach was developed to capture healthcare providers, access mode, interaction type, and event details.
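As an illustration only, curation against GP-registration history could be sketched as keeping just those events that fall inside one of a person's registration spells. The abstract does not state the actual rule used, so the inclusion criterion, the data layout, and every name below (`registrations`, `events`, `is_registered`) are hypothetical assumptions, not the authors' implementation.

```python
from datetime import date

# Hypothetical registration spells: person_id -> list of (start, end) dates.
registrations = {
    "p1": [(date(2019, 1, 1), date(2020, 6, 30))],
    "p2": [(date(2021, 3, 1), date(2024, 12, 31))],
}

# Hypothetical raw events: (person_id, event_date, clinical_code).
events = [
    ("p1", date(2019, 5, 2), "code_a"),
    ("p1", date(2021, 1, 15), "code_b"),  # outside p1's registration spell
    ("p2", date(2022, 7, 9), "code_c"),
    ("p3", date(2022, 7, 9), "code_d"),   # person with no registration history
]


def is_registered(person_id, event_date):
    """Return True if the event date lies within any registration spell."""
    spells = registrations.get(person_id, [])
    return any(start <= event_date <= end for start, end in spells)


# Assumed rule: retain only events recorded during a registration spell.
curated = [event for event in events if is_registered(event[0], event[1])]

# Share of raw records retained, and share of people with any retained record.
retention = len(curated) / len(events)
inclusion = len({e[0] for e in curated}) / len({e[0] for e in events})
```

Under this assumed rule, record retention and patient inclusion are simple ratios of curated to raw counts, which is one way figures of that kind could be tracked over time.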

We assessed RRDA coverage, defined as the proportion of residents with shared primary care records, stratified by demographic and geographic factors, using longitudinal binomial Generalised Additive Mixed Models (GAMMs).
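The coverage definition above amounts to a stratified proportion. A minimal sketch, assuming a hypothetical row layout of (year, region, has-shared-record flag); it computes only the raw proportions and does not fit the binomial GAMMs used in the study.

```python
from collections import defaultdict

# Hypothetical resident rows: (year, region, has_shared_gp_record).
residents = [
    (2024, "north", True),
    (2024, "north", False),
    (2024, "south", True),
    (2024, "south", True),
]


def coverage_by_stratum(rows):
    """Proportion of residents with shared records per (year, region)."""
    covered = defaultdict(int)
    total = defaultdict(int)
    for year, region, has_record in rows:
        key = (year, region)
        total[key] += 1
        covered[key] += int(has_record)
    return {key: covered[key] / total[key] for key in total}


coverage = coverage_by_stratum(residents)
```

Here the numerator and denominator per stratum are the quantities a binomial model would take as successes and trials.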

We categorised GP events into key activity types and summarised them as average daily rates per month per 100,000 people (2000-2024), with trends analysed using negative binomial GAMMs.
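One plausible reading of this rate, shown as a sketch: divide a month's event count by the number of days in that month and by the population, then scale to 100,000 people. The function name and the example figures are hypothetical, and the choice of population denominator is an assumption the abstract does not specify.

```python
import calendar


def daily_rate_per_100k(event_count, population, year, month):
    """Average daily events in a month per 100,000 people."""
    days_in_month = calendar.monthrange(year, month)[1]
    return event_count * 100_000 / (days_in_month * population)


# Made-up example: 310,000 events in January 2024 among 1,000,000 people.
january_rate = daily_rate_per_100k(310_000, 1_000_000, 2024, 1)
```

Dividing by days in the month keeps months of different lengths comparable, including February in leap years.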

Results

Curating 4.6 billion records for 5.1 million people (1990-2024) revealed significant improvements in data quality and completeness over time, with data retention increasing from 40% to 94% and patient inclusion from 43% to 98%. Use of SNOMED and local codes increased after the discontinuation of Read-V2 in 2018, while invalid codes declined, reflecting evolving coding practices and improved data quality.

WLGP RRDA coverage rose from 35% in 1990 to 86% in 2024, with regional variation but modest demographic differences. From 2000 to 2024, consultation rates rose by 1.9 times, with post-COVID-19 pandemic levels 8% above 2019 levels. Prescription-only activity doubled, with little variation associated with the pandemic. Vaccination rates spiked during the pandemic and remain 1.8 times above pre-pandemic levels. Other, less frequent activities were significantly disrupted during the COVID-19 pandemic but recovered to 2019 levels.

Conclusions

The WLGP RRDA improves the usability of primary care data, supporting timely, scalable analysis of healthcare delivery and system-level trends.
