Rapid feedback on hospital onset SARS-CoV-2 infections combining epidemiological and sequencing data

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    The paper describes an algorithm that combines epidemiological and sequence data to provide a rapid assessment of the probability of healthcare-associated infections among hospital onset SARS-CoV-2 infections, that also may be associated with outbreak events. There is an urgent need for tools that can synthesise multiple data streams to provide real time information to healthcare professionals. It is questionable to what extent the tool presented is generalisable to medical facilities outside of the specific data rich settings considered here, or if the tool is useful for prospective analyses. This study would be of interest to specialists working in hospital infection prevention, with more limited further interest.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Background:

Rapid identification and investigation of healthcare-associated infections (HCAIs) is important for suppression of SARS-CoV-2, but the infection source for hospital onset COVID-19 infections (HOCIs) cannot always be readily identified based only on epidemiological data. Viral sequencing data provides additional information regarding potential transmission clusters, but the low mutation rate of SARS-CoV-2 can make interpretation using standard phylogenetic methods difficult.

Methods:

We developed a novel statistical method and sequence reporting tool (SRT) that combines epidemiological and sequence data in order to provide a rapid assessment of the probability of HCAI among HOCI cases (defined as first positive test >48 hr following admission) and to identify infections that could plausibly constitute outbreak events. The method is designed for prospective use, but was validated using retrospective datasets from hospitals in Glasgow and Sheffield collected February–May 2020.
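
As an illustration of the HOCI case definition quoted above (first positive test more than 48 hours after admission), the following is a minimal sketch. It is not the authors' SRT code; the function name `is_hoci` and the constant `HOCI_THRESHOLD` are hypothetical names introduced here, and the timestamps are invented examples.

```python
from datetime import datetime, timedelta

# Threshold taken from the definition in the text: >48 hr after admission.
HOCI_THRESHOLD = timedelta(hours=48)


def is_hoci(admission: datetime, first_positive: datetime) -> bool:
    """Return True if the first positive test falls strictly more than
    48 hours after admission (hypothetical helper, for illustration only)."""
    return (first_positive - admission) > HOCI_THRESHOLD


# Invented example timestamps, not data from the study.
admitted = datetime(2020, 3, 1, 9, 0)
print(is_hoci(admitted, datetime(2020, 3, 4, 9, 0)))  # 72 hours after admission
print(is_hoci(admitted, datetime(2020, 3, 3, 9, 0)))  # exactly 48 hours after admission
```

The comparison is strict, so a test taken exactly 48 hours after admission does not count as a HOCI under this reading of the definition.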

Results:

We analysed data from 326 HOCIs. Among HOCIs with time from admission ≥8 days, the SRT algorithm identified close sequence matches from the same ward for 160/244 (65.6%) and in the remainder 68/84 (81.0%) had at least one similar sequence elsewhere in the hospital, resulting in high estimated probabilities of within-ward and within-hospital transmission. For HOCIs with time from admission 3–7 days, the SRT probability of healthcare acquisition was >0.5 in 33/82 (40.2%).
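
The counts and percentages reported above can be cross-checked with simple arithmetic. The sketch below is a reader's sanity check only, not part of the SRT; `pct` is a helper introduced here for illustration.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts as reported in the Results paragraph.
late_total = 244        # HOCIs with time from admission >= 8 days
late_same_ward = 160    # of those, with a close sequence match on the same ward
early_total = 82        # HOCIs with time from admission 3-7 days

remainder = late_total - late_same_ward   # HOCIs without a same-ward match
same_ward_pct = pct(late_same_ward, late_total)
elsewhere_pct = pct(68, remainder)
early_pct = pct(33, early_total)
all_hocis = late_total + early_total

print(remainder, same_ward_pct, elsewhere_pct, early_pct, all_hocis)
```

The remainder (84) and the overall total (326) computed here agree with the denominators given in the text.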

Conclusions:

The methodology developed can provide rapid feedback on HOCIs that could be useful for infection prevention and control teams, and warrants further prospective evaluation. The integration of epidemiological and sequence data is important given the low mutation rate of SARS-CoV-2 and its variable incubation period.

Funding:

COG-UK HOCI is funded by the COG-UK consortium, which is supported by funding from UK Research and Innovation, the National Institute of Health Research and the Wellcome Sanger Institute.

Article activity feed

  1. Author Response:

    Evaluation Summary:

    The paper describes an algorithm that combines epidemiological and sequence data to provide a rapid assessment of the probability of healthcare-associated infections among hospital onset SARS-CoV-2 infections, that also may be associated with outbreak events. There is an urgent need for tools that can synthesise multiple data streams to provide real time information to healthcare professionals. It is questionable to what extent the tool presented is generalisable to medical facilities outside of the specific data rich settings considered here, or if the tool is useful for prospective analyses. This study would be of interest to specialists working in hospital infection prevention, with more limited further interest.

    We thank eLife for the commentary on our work. We agree that there is a need for robust prospective evaluation of routine viral sequencing of SARS-CoV-2 for Infection Prevention and Control and of this tool specifically. Our research group is conducting such work within a multi-centre prospective study that is currently ongoing (https://clinicaltrials.gov/ct2/show/NCT04405934; https://doi.org/10.1101/2021.04.13.21255342).

    Reviewer #1 (Public Review):

    -In the present paper the authors have attempted to develop a novel statistical method and sequence reporting tool that combines epidemiological and sequence data to provide a rapid assessment of the probability of HCAI among HOCI cases (defined as first positive test >48 hours following admission) and to identify infections that could plausibly constitute outbreak events.

    -As healthcare-associated infections in hospitals present a significant health risk to both vulnerable patients and healthcare workers, improving the rapid assessment of the probability of HCAI among HOCI cases is of utmost importance in a pandemic setting.

    -The strength of the paper is that the authors have successfully used a large volume of virus sequence data from selected hospitals in two UK cities and developed a statistical method to combine these with classical epidemiological data, resulting in a sequence reporting tool (SRT) that was evaluated in relation to:

    -The IPC classification system recommended by PHE,

    -The PHE definition of healthcare-associated COVID-19 outbreaks (using a 2 SNP threshold).

    -They show the added value of combining the two systems. Obviously, this can only work prospectively in a setting like the UK, where a system like the COVID-19 Genomics UK (COG-UK) initiative is effectively in place. They conclude that, through retrospective application to clinical datasets, they have demonstrated that the methodology is able to provide confirmatory evidence for most PHE-defined definite and probable HCAIs and provide further information regarding indeterminate HCAIs. Therefore, the SRT may allow IPC teams to focus their resources on areas with likely nosocomial acquisition events.

    -The acquisition of the extensive prospective datasets necessary to use the system requires a non-negligible investment that is possible in a setting in which routine sequencing and phylogenetic analyses can be carried out in real time. The added value of the methodology should eventually justify the investment.

    We thank the reviewer for their summary and commentary on our work. We agree that full evaluation of the use of viral sequencing for clinical practice requires health economic analysis of the associated costs relative to potential gains, and this is planned within our ongoing research program on this topic.

    Reviewer #2 (Public Review):

    Since early 2020, the SARS-CoV-2 pandemic has presented numerous challenges to healthcare facilities around the world. Given the highly transmissible nature of SARS-CoV-2 virus, and the confined nature of most hospital settings, hospital acquired infections with SARS-CoV-2 are a frequent occurrence and pose major challenges for hospital infection prevention teams. The increasing use of genomic epidemiology, facilitated by cheaper/faster genetic sequencing tools and user-friendly algorithms for data analysis, creates new opportunities for using virus sequencing to track virus spread in healthcare facilities. While opportunities are increasing, there remain two important bottlenecks to meaningful and widespread use of genomic epidemiology in well-resourced healthcare settings - 1. the turnaround time from sample collection to delivery of sequenced and analysed result; 2. a lack of training among many infection prevention personnel in interpreting genomic epidemiology output.

    The study by Stirrup et al. tries to alleviate these issues through the development of an algorithm that synthesises inferences from virus genetic sequences and hospital epidemiological data to provide easy-to-interpret information about whether or not there is likely to be ongoing virus transmission within a medical facility. In general, these kinds of approaches are highly worthwhile and can have important translational value as they facilitate the use of powerful new technologies without necessarily requiring extensive professional training to interpret the results. Indeed, there is an urgent need for tools that can synthesise multiple data streams to provide real time information to healthcare professionals.

    In this study, the authors describe their new algorithm and apply it in two retrospective cases to evaluate its potential to provide valuable information to infection control teams. While it seems clear that the algorithm reliably detects nosocomial transmission in situations where there are obvious hospital outbreaks, it is much less clear that it performs meaningfully in situations where nosocomial transmission is more questionable. To this end, it is not clear if the algorithm provides useful or meaningful information that would help to reduce the burden of hospital acquired SARS-CoV-2 infections. Towards the end of the discussion section, the authors mention that analyses on the utility of the algorithm in prospective use cases were ongoing from late 2020 to early 2021. These analyses will provide essential information on the value of this tool.

    While the development of these sorts of tools is important, it is unclear from this study if the tool has value in prospective use or if it would be useful in settings where virus genetic sequencing is less frequent and/or slower than the retrospective use cases considered here. Additionally, in many infection prevention scenarios the existence of an outbreak is clear but tracing the routes of transmission is the primary object of investigation. Because the algorithm does not include phylogenetic information, tracing potential transmission routes of infection is not possible.

    We thank the Reviewer for their commentary on our work. Our ongoing prospective study on implementation of the reporting tool includes intervention phases both with a ‘rapid’ target turnaround of 48 hours from sampling and with a ‘slow’ target turnaround of 5-10 days, and this will generate data on the relative utility of viral sequencing within these timeframes. We acknowledge that the reporting tool developed does not evaluate evidence of direct transmission between case pairs, although it should also be noted that phylogenetic investigation alone cannot be used to confidently infer direct transmission linkage for SARS-CoV-2. We feel that the algorithm and report format can flag potential transmission routes to IPC teams, through the identification of close sequence matches within the hospital as a whole and highlighting of any matching previous ward locations (although the latter is not used in the probability calculations).

  2. SciScore for 10.1101/2020.11.12.20230326:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    A limitation of the current SRT approach and of the retrospective data available is that they do not include detailed information regarding work locations for HCWs. However, prospective use of the SRT would allow IPC teams to investigate linkage from a HOCI to any HCWs flagged as having a close sequence match. While a phylogenetic approach is useful in excluding direct transmission between cases, it can be more problematic to confirm transmission source[26]. Phylogenetic models can evaluate the full genetic information provided by viral sequence data, but there are challenges in incorporating and summarising associated patient meta-data in a timely fashion[27]. There will be cases in which phylogenetic analysis would provide information beyond that returned by the SRT. However, fully integrated epidemiological and phylogenetic analysis of hospital outbreaks is resource-intensive, presenting challenges in delivering the rapid turnaround and scale-up required to provide clear feedback to hospital IPC teams outside of research-intensive settings. Comparison of SRT output to phylogenetic trees in a number of test cases suggested that some clusters of genetically similar cases identified within a specific ward likely represented more than one transmission event onto the ward from similar viral lineages circulating within the healthcare system. Whilst monophyletic clusters associated with a single location are easier to interpret, we consider the presence of viruses within a ward o...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.