Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY

Abstract

Long COVID describes new or persistent symptoms at least 4 weeks after onset of acute COVID-19. Clinical codes to describe this phenomenon were recently created.

Aim

To describe the use of long-COVID codes, and variation of use by general practice, demographic variables, and over time.

Design and setting

Population-based cohort study in English primary care.

Method

Working on behalf of NHS England, the authors used OpenSAFELY data encompassing 96% of the English population between 1 February 2020 and 25 May 2021. The proportion of people with a recorded code for long COVID was measured overall and by demographic factors, electronic health record software system (EMIS or TPP), and week.

Results

Long COVID was recorded for 23 273 people. Coding was unevenly distributed among practices, with 26.7% of practices having never used the codes. Regional rates ranged from 20.3 per 100 000 people in the East of England (95% confidence interval [CI] = 19.3 to 21.4) to 55.6 per 100 000 people in London (95% CI = 54.1 to 57.1). Coding was higher among females (52.1 per 100 000, 95% CI = 51.3 to 52.9) than males (28.1, 95% CI = 27.5 to 28.7), and higher among practices using EMIS (53.7 per 100 000, 95% CI = 52.9 to 54.4) than those using TPP (20.9, 95% CI = 20.3 to 21.4).
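
As a rough illustration of how a rate per 100 000 with a 95% CI can be derived from a case count and a population denominator, the sketch below uses a normal approximation to the binomial proportion. This is an assumption made for illustration only: the abstract does not state which interval method the authors used. The function name `rate_per_100k` is hypothetical, and the denominator of 58 million is the rounded figure from the title rather than the exact study population.

```python
import math


def rate_per_100k(cases, population, z=1.96):
    """Return (rate, lower, upper) per 100 000 people.

    Uses a normal-approximation (Wald) interval for a proportion.
    This is an illustrative choice, not necessarily the authors' method.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    p = cases / population
    se = math.sqrt(p * (1 - p) / population)
    scale = 100_000
    lower = max(0.0, p - z * se) * scale
    upper = (p + z * se) * scale
    return p * scale, lower, upper


# 23 273 recorded cases; ~58 million is the rounded denominator from the title.
rate, lower, upper = rate_per_100k(23_273, 58_000_000)
print(f"{rate:.1f} per 100 000 (95% CI = {lower:.1f} to {upper:.1f})")
```

With denominators this large the interval is narrow, which is consistent with the tight CIs reported for the regional and demographic subgroups.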

Conclusion

Current recording of long COVID in primary care is very low, and variable between practices. This may reflect patients not presenting; clinicians and patients holding different diagnostic thresholds; or challenges with the design and communication of diagnostic codes. Increased awareness of diagnostic codes is recommended to facilitate research and planning of services, as are surveys with qualitative work to better evaluate clinicians’ understanding of the diagnosis.

SciScore for 10.1101/2021.05.06.21256755:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	not detected.
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Software and Reproducibility: Data management and analysis was performed using the OpenSAFELY software libraries and Jupyter notebooks, both implemented using Python 3.	Python suggested: (IPython, RRID:SCR_001658)
This is an analysis delivered using federated analysis through the OpenSAFELY platform: codelists and code for data management and data analysis were specified once using the OpenSAFELY tools; then transmitted securely from the OpenSAFELY jobs server to the OpenSAFELY-TPP platform within TPP’s secure environment, and separately to the OpenSAFELY-EMIS platform within EMIS’s secure environment, where they were each executed separately against local patient data; summary results were then reviewed for disclosiveness, released, and combined for the final outputs.	EMIS’s suggested: None
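
The federated pattern described above (the same analysis run separately inside each vendor's environment, with only summary results released and combined) can be sketched as follows. This is a minimal illustration and not the OpenSAFELY implementation: the `BackendSummary` class, the `combine` and `pooled_rate_per_100k` helpers, and all counts are hypothetical and invented for the example.

```python
from dataclasses import dataclass


@dataclass
class BackendSummary:
    """Aggregate counts released from one backend (no patient-level data)."""
    backend: str
    cases: int
    population: int


def combine(summaries):
    """Pool released aggregate counts across backends."""
    return BackendSummary(
        backend="combined",
        cases=sum(s.cases for s in summaries),
        population=sum(s.population for s in summaries),
    )


def pooled_rate_per_100k(summary):
    """Crude rate per 100 000 people from aggregate counts."""
    return 100_000 * summary.cases / summary.population


# Made-up counts for illustration only; these are not study results.
summaries = [
    BackendSummary("TPP", cases=50, population=200_000),
    BackendSummary("EMIS", cases=150, population=300_000),
]
combined = combine(summaries)

for s in summaries + [combined]:
    print(f"{s.backend}: {pooled_rate_per_100k(s):.1f} per 100 000")
```

Only counts cross the boundary of each secure environment in this sketch, so the combined rate is computed from pooled numerators and denominators rather than from pooled patient records.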

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Strengths and weaknesses: The key strength of this study is its unprecedented scale: we include over 58 million people, 95% of the population in England. In contrast with many studies that use electronic health record data, we were also able to compare long COVID diagnostic codes between practices that use different software systems, and find a striking disparity: this has important implications for understanding whether clinicians are using the codes appropriately. A key weakness of this data for estimating true prevalence of long COVID in primary care, and factors associated with the condition, is that it relies on clinicians formally entering a diagnostic or referral code into the patient’s record: we note that this is a limitation of all electronic health record research for all clinical conditions and activity; however the emergence of a new diagnosis and recent launch of a new set of diagnostic codes may present new challenges in this regard. Research in Context: To our knowledge there are no other studies on prevalence of long COVID using clinicians’ diagnoses or electronic health records data. There are numerous studies using self-reported data from patients on the prevalence of continued symptoms following COVID-19, with estimates varying between 4.5% and 89%, largely due to highly variable case definitions [4]; individual symptoms characterising long COVID have been reported as fatigue, headache, dyspnea and anosmia [5]. The ONS COVID Infection Survey estimates prevalenc...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.
