Inference of the SARS-CoV-2 generation time using UK household data

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This paper extends a previous analytical method that the authors developed to evaluate the time to infectiousness of COVID-19, in order to evaluate differences in the generation interval across different time periods during the course of the pandemic in England in 2020. This study will be of interest to policymakers and modellers. While the results appear technically robust for the data analysed, their usefulness is limited by difficulty in extending the results to other contexts.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

Abstract

The distribution of the generation time (the interval between individuals becoming infected and transmitting the virus) characterises changes in the transmission risk during SARS-CoV-2 infections. Inferring the generation time distribution is essential to plan and assess public health measures. We previously developed a mechanistic approach for estimating the generation time, which provided an improved fit to data from the early months of the COVID-19 pandemic (December 2019-March 2020) compared to existing models (Hart et al., 2021). However, few estimates of the generation time exist based on data from later in the pandemic. Here, using data from a household study conducted from March to November 2020 in the UK, we provide updated estimates of the generation time. We considered both a commonly used approach in which the transmission risk is assumed to be independent of when symptoms develop, and our mechanistic model in which transmission and symptoms are linked explicitly. Assuming independent transmission and symptoms, we estimated a mean generation time (4.2 days, 95% credible interval 3.3–5.3 days) similar to previous estimates from other countries, but with a higher standard deviation (4.9 days, 3.0–8.3 days). Using our mechanistic approach, we estimated a longer mean generation time (5.9 days, 5.2–7.0 days) and a similar standard deviation (4.8 days, 4.0–6.3 days). As well as estimating the generation time using data from the entire study period, we also considered whether the generation time varied temporally. Both models suggest a shorter mean generation time in September-November 2020 compared to earlier months. Since the SARS-CoV-2 generation time appears to be changing, further data collection and analysis is necessary to continue to monitor ongoing transmission and inform future public health policy decisions.
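To make the reported summary statistics concrete, the sketch below converts a mean and standard deviation into the parameters of a distribution by moment matching. It assumes, purely for illustration, that the generation time is lognormally distributed; this page does not state the distributional form used by the authors, and the function names are hypothetical. The inputs are the point estimates quoted above for the independent transmission and symptoms model (mean 4.2 days, standard deviation 4.9 days).

```python
import math


def lognormal_params(mean, sd):
    """Moment-match a lognormal: return (mu, sigma) of the underlying normal."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def lognormal_cdf(t, mu, sigma):
    """Probability that a lognormal(mu, sigma) variable is at most t."""
    if t <= 0.0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Point estimates quoted in the abstract (independent model).
# ASSUMPTION: lognormal shape, chosen only for illustration.
mu, sigma = lognormal_params(4.2, 4.9)
median_days = math.exp(mu)
share_within_week = lognormal_cdf(7.0, mu, sigma)

print(f"mu={mu:.3f}, sigma={sigma:.3f}")
print(f"median generation time: {median_days:.2f} days")
print(f"share of transmissions within 7 days: {share_within_week:.2f}")
```

Because the standard deviation exceeds the mean, the matched distribution is strongly right-skewed, so its median lies well below the 4.2-day mean. Under a different assumed shape (for example, a gamma distribution) the derived quantities would differ.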

Article activity feed

  1. Author Response:

    Evaluation Summary:

    This paper extends a previous analytical method that the authors developed to evaluate the time to infectiousness of COVID-19, in order to evaluate differences in the generation interval across different time periods during the course of the pandemic in England in 2020. This study will be of interest to policymakers and modellers. While the results appear technically robust for the data analysed, their usefulness is limited by difficulty in extending the results to other contexts.

    We thank the editors for this helpful summary and for recognising the importance of our results for both policymakers and modellers. We provide responses to the comments of Reviewer 1 below to resolve the concerns about the generalisability of our research, indicating how the results are useful in other contexts.

    Reviewer #1 (Public Review):

    This paper extends a previous analytical method that the authors developed to evaluate the time to infectiousness of COVID-19, in order to evaluate differences in the generation interval across different time periods during the course of the pandemic in England in 2020. The time to infectiousness (i.e. how long it takes until infected individuals start producing virus in a way that poses a risk of infecting others) is a generalisable concept. That is, unless we expect there to be inherent differences in the way infected individuals progress to becoming infectious (when looking at distributions of outcomes, comparing between populations of interest), we can take a result from one population of individuals and assume that it gives us a reasonable idea of how long it takes to become infectious in another population. Differences in the way people come into contact with each other will have some influence on this, but generally speaking, if a person is infectious after 4 days in China, you should consider a person to pose a risk of infecting others after 4 days in other countries as well.

    In contrast, generation time (how long does it take an infected person, on average, to infect the persons they are going to infect?) depends strongly not just on the inherent characteristics of the virus and the progression of disease in individuals, but also (more strongly than time to infectiousness) on the circumstances of contact between individuals. Because generation time is tied to so many other factors, one of the most reliable ways to estimate generation times is to analyse data from groups of in-contact individuals in which it is highly likely that only one generation of transmission is involved (where contacts between individuals are clustered, possibly two, but with three generations highly unlikely). In this case, the most important unknowns are the time from when individuals are infected to when they become infectious and the time to when they test positive - the requirement for time to infectiousness is why the methods used in the initial paper are appropriate for generating better generation time estimates.

    We thank the reviewer for their helpful comments, and are pleased that they recognise that our mechanistic model is appropriate for estimating the generation time. The reviewer is correct that the distribution of the time to infectiousness is likely to be more consistent between settings than that of the generation time, which depends on both the infectiousness of infected hosts at different times since infection and on behavioural factors (for example, if infected individuals self-isolate after developing symptoms, this acts to reduce the generation time; adding this explicit link between symptoms and infectiousness was the main advance of our original eLife article). Unfortunately, however, in many scenarios it is most important to estimate the generation time (rather than inherent infectiousness), since the generation time describes realised transmission. For example, estimates of the time-dependent reproduction number depend on the generation time distribution, since it is a characteristic of realised transmission in the population. As a result, obtaining up-to-date and location-specific estimates of the SARS-CoV-2 generation time is crucial, particularly in light of our finding that the generation time changes temporally.
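The dependence of the time-dependent reproduction number on the generation time distribution can be illustrated with the standard renewal equation, R_t = I_t / Σ_s w_s I_{t-s}, where I_t is incidence on day t and w_s is the probability that the generation time is s days. The sketch below is not the authors' method; the incidence series and both sets of weights are hypothetical values chosen only to show the direction of the effect.

```python
def reproduction_numbers(incidence, w):
    """Renewal-equation estimate of R_t for each day.

    incidence: daily numbers of new infections.
    w: discretised generation time distribution, w[s - 1] = P(generation time = s days).
    Returns None for days with no prior infection pressure.
    """
    estimates = []
    for t in range(len(incidence)):
        pressure = sum(
            w[s - 1] * incidence[t - s] for s in range(1, min(t, len(w)) + 1)
        )
        estimates.append(incidence[t] / pressure if pressure > 0 else None)
    return estimates


# HYPOTHETICAL inputs: incidence growing by 10% per day, and two
# illustrative generation time distributions (each sums to 1).
incidence = [100 * 1.1 ** t for t in range(30)]
w_short = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]            # mean 3.3 days
w_long = [0.0, 0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1]   # mean 5.0 days

r_short = reproduction_numbers(incidence, w_short)[-1]
r_long = reproduction_numbers(incidence, w_long)[-1]
print(f"R_t with shorter generation time: {r_short:.2f}")
print(f"R_t with longer generation time:  {r_long:.2f}")
```

For the same growing incidence curve, the shorter generation time yields an estimate closer to 1. Using an outdated (too long) generation time distribution during a growth phase would therefore tend to overstate the reproduction number.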

    As most published results relate to the very early stages of the pandemic in China, where extensive contact tracing was done, there is some interest in understanding whether the generation times differ substantially in other locations and whether they change over time (and, if so, why). In this analysis, Hart et al. estimate generation times across three three-month time periods using household contact data in England in 2020, and show differences in generation time estimates depending on the method used (in particular, when considering an approach that ties infectiousness to symptom development, which they showed provided better results than other methods in their previous paper) and on the period of 2020 over which the estimates are taken.

    While the result appears technically robust for the data analysed, its usefulness is limited by difficulty in extending the results. Although the dataset differs from those used for the analyses in China to which they refer, and from that of Challen et al., which looked at contacts of international travellers in the UK, it is also in its own way quite specific, and a further breakdown of possible factors would be worthwhile.

    We agree with the reviewer that investigating whether the generation time varies by location and temporally is an interesting research question. Since, as we show, the generation time actually does vary temporally, it is crucial to monitor the generation time during epidemics and use the most up-to-date estimates when analysing population-level transmission. While we used data from households in our analyses, our approach corrects for the regularity of household contacts to obtain widely applicable generation time estimates (see the revised manuscript and our response to the reviewer’s next point below). Since household data are routinely collected, we contend that this manuscript provides a useful advance on our previous manuscript (which considered data from known transmission pairs) by providing a general framework for estimating the generation time, as well as some of the most up-to-date SARS-CoV-2 generation time estimates currently available. We also agree with the reviewer that a further breakdown of possible factors would be a worthwhile extension of this research. Of course, doing this would require data on the characteristics of individuals and households (e.g. ages or socio-economic statuses of different individuals) to be available. In the Discussion of the revised manuscript, we explain the need to conduct such analyses in future to understand how the generation time depends on specific characteristics more clearly.

    First, the limitation to household contacts means that the data are not representative of general transmission in the population - household contacts are high risk, with many opportunities for transmission, and generation times within households may therefore be relatively short. Generalised contacts outside of households are likely to be less frequent, often of shorter duration, and more strongly affected by diurnal and weekly rhythms.

    We agree that the high frequency of household contacts would be expected to lead to shorter generation times within households than in the wider population. However, we explicitly correct for this in our analysis. In the revised manuscript, we now highlight in both the Results and the Discussion that we include the regularity of household contacts and the availability of susceptible hosts in households in the likelihood function to derive widely applicable estimates of the generation time. These estimates, which correspond to the generation time assuming a constant supply of susceptibles during infection, can then be conditioned to specific population structures. For example, we estimated the realised generation times within the study households in Figure 1-figure supplement 4. As expected, these household generation times are shorter than our main estimates in Figure 1. Moreover, our work demonstrates the important principle that changes in the generation time can be detected using data from household studies, highlighting both the importance of continued monitoring of the generation time and the role of household data in monitoring efforts (see the revised manuscript). Finally, we note that household data have previously been used to estimate the generation time for other pathogens – see particularly the highly cited study of influenza by Ferguson et al. (https://doi.org/10.1038/nature04017) to which we refer in our manuscript.
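The distinction between a generation time that assumes a constant supply of susceptibles and the shorter realised generation time within a household can be sketched numerically. The toy model below is an assumption made for illustration and is not the authors' likelihood: one infector, one susceptible housemate, and an infection hazard equal to `beta` times an intrinsic generation time density `f`. The housemate can only be infected once, so the realised density is proportional to f(t)·exp(-beta·F(t)), where F is the cumulative distribution of f. The lognormal shape, the parameter `beta`, and the function names are all hypothetical.

```python
import math


def lognormal_pdf(t, mu, sigma):
    """Density of a lognormal(mu, sigma) variable at t > 0."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def realised_mean(beta, mean=4.2, sd=4.9, dt=0.01, t_max=200.0):
    """Mean time to infection of a single housemate, given infection occurs.

    beta is a HYPOTHETICAL overall transmission intensity; as beta -> 0 the
    result approaches the intrinsic mean, and larger beta shortens it.
    """
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu, sigma = math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)
    cumulative = 0.0          # F(t): intrinsic CDF accumulated so far
    numerator = denominator = 0.0
    for k in range(int(t_max / dt)):
        t = (k + 0.5) * dt    # midpoint of the current time step
        f = lognormal_pdf(t, mu, sigma)
        # Probability the housemate is still uninfected at the midpoint.
        still_susceptible = math.exp(-beta * (cumulative + 0.5 * f * dt))
        weight = f * still_susceptible * dt
        numerator += t * weight
        denominator += weight
        cumulative += f * dt
    return numerator / denominator


for beta in (1e-9, 0.5, 2.0):
    print(f"beta={beta:g}: realised mean = {realised_mean(beta):.2f} days")
```

In this sketch, more intense household contact (larger `beta`) shifts the realised generation time earlier, because late transmission opportunities are lost once the housemate has already been infected. This matches the qualitative statement above that realised household generation times are shorter than estimates assuming a constant supply of susceptibles.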

    Second, it is also known that demographic factors such as ethnicity and income are strongly linked to infection and severe infection risk. While this does not tell us directly about any links to infectiousness and infectious contact, it is reasonable to consider a connection - and therefore a link to generation times. As such, in this relatively small sample (172 households, with much higher numbers in the first 3 months, compared to the middle or last three) differences in demographics may influence generation times as well.

    While we agree with the reviewer that the accuracy of our estimates may have been impacted if the study households were not representative of the wider population, we do not believe this caveat to be any more specific to our study than to other studies in which the SARS-CoV-2 generation time has been estimated. In fact, our sample size is larger than those used in all other such studies of which we are aware. We discuss this point in our revised manuscript and note that comparing the generation time between individuals/households of different characteristics is an interesting and important area for future work (see revised manuscript).

    Finally, the Alpha variant, first identified in Kent, was probably circulating for much of the final three months of this analysis. Dominant in the UK by early 2021, it would have accounted for a variable proportion of infections across much of those final three months, and that proportion also varied geographically (with a much earlier rise in the SE and in London). Unless those proportions are known, it would be difficult to know how much differences in generation times are due to the variant, to demographics, or to other, possibly behavioural, factors. Thus some caution should be applied before taking general lessons from it, at least in the absence of those additional considerations.

    Thank you for this interesting comment. In fact, the Public Health England household study underlying our results included genomic surveillance. The Alpha variant was only responsible for infections in two study households, so we can be confident that this variant was not responsible for our finding of a temporal decrease in the generation time. Since this is an important point, we have now stated it clearly in both the Results and Discussion of the revised manuscript. If more recent data become available, obtaining further updated generation time estimates in light of novel variants is an important area of future work (as noted in the revised submission).

    Reviewer #2 (Public Review):

    In this work, Hart et al infer the generation interval for SARS-CoV-2 using infector-infectee pairs from household data. The generation interval is obtained across three different time intervals (March-April, May-August and September-November) and using both an "independent transmission" model and the "mechanistic" model that was originally proposed in Hart et al 2021. The main result is that the inferred generation interval in September-November has decreased compared to the earlier months of the pandemic, irrespective of the model considered. Overall, the conclusions drawn in the paper are well supported and have been shown to be robust through a thorough sensitivity analysis.

    We thank the reviewer for their useful comments and suggestions, and are pleased that the reviewer considers our conclusions to be well supported and robust.

    Strengths

    • They use a mechanistic model to account for the change in infectivity at symptom onset.
    • A major strength of this investigation is that they can observe the dynamics of the generation time over three different time periods of the pandemic. To my knowledge, this is a novel result that allows for a more up to date understanding of SARS-CoV-2 transmission.
    • Whilst not highlighted in the text, it appears that there has been significant effort to extend the likelihood function to appropriately model household dynamics. This is non-trivial work in my opinion, and I believe the details of the derivation will be of use to mathematical modellers that deal with susceptible depletion in their data.

    We thank the reviewer for highlighting some of the key strengths of our study. We agree that the methodological advance in this study is important and useful for epidemiological modellers, and we thank the reviewer for encouraging us to highlight this more clearly. We have therefore followed the reviewer’s suggestion by adding a paragraph to the Results in which we summarise the methodological advance required to fit the models developed in our previous work to data from households rather than infector-infectee pairs.

    Weaknesses

    • The main weakness of the paper in its current form is that the analysis appears superficial, with a large amount of curve fitting and very little explanation. It would be beneficial if the authors delved more deeply into their results, especially with the mechanistic model. It would be very interesting to relate the changes in generation time to mechanisms of transmission.

    While the primary aim of this research was to obtain updated generation time estimates and demonstrate the key principle that this important quantity is changing, in our revised submission we have extended the analyses within and around Figure 3 to delve deeper into the finding of a temporal decrease in the generation time. First, we have added a new panel to Figure 3 (panel C in the revised submission) in which we show that the predicted decrease in generation time was accompanied by an increase in the proportion of presymptomatic transmissions, with a very high 83% of transmissions predicted to occur before symptom onset (among infectors who developed symptoms) in September-November. We note in the Discussion that this finding is consistent with our hypothesis that a shorter generation time in the autumn months may have resulted from increased indoor contacts as the weather became colder, particularly among individuals without COVID-19 symptoms (whereas symptomatic hosts were still expected to self-isolate).

    Second, as suggested by the reviewer below, we have added a new figure (Figure 3-figure supplement 3) in which we compare the generation time distribution itself between the three different time periods (compared to Figure 3, where we focus on the mean and standard deviation of this distribution), as well as the distributions of the time from symptom onset to transmission (TOST) and the serial interval. Both models indicate that the transmission risk peaked earlier in infection for individuals infected in September-November than in earlier months. Third, we have added a figure (Figure 3-figure supplement 5) in which we compare estimates of individual model parameters for the mechanistic model between the different time periods. As described in the revised manuscript, this showed that our finding of a shorter generation time and higher proportion of presymptomatic transmissions in September-November compared to earlier months may have resulted from any of: (i) an increase in the relative infectiousness of presymptomatic infectors compared to symptomatic infectors (which is consistent with the hypothesis of increased indoor mixing among non-symptomatic individuals described above); (ii) a decrease in the (mean) duration of the symptomatic infectious period (which could, for example, result from faster isolation of symptomatic individuals); or (iii) a decrease in the (mean) time to infectiousness. However, since there was substantial overlap in the credible intervals for each individual parameter between the time periods, it was not possible to definitively identify the parameter(s) responsible for the observed change in the generation time.

    • The authors calculate the mean and standard deviation of the generation interval across three different time points; however, they only present one figure with the distribution of the generation time (Figure 2). It would be interesting to know how the generation time distribution changes in time, as opposed to just the mean and standard deviation. I believe that such an analysis would link nicely to their previous work, where they highlight the importance of ongoing public health measures such as contact tracing.

    As described in our response to the previous point above, we have implemented this excellent suggestion in our revised submission.

    Reviewer #3 (Public Review):

    The authors have previously published a mechanistic model for inferring the infectiousness profile that explicitly models the dependence of the risk of onward transmission on the onset of symptoms in an individual. In the present study, they apply this model, as well as another more commonly used model which assumes these two things (transmission risk and onset of symptoms) to be independent, to data from a household study conducted from March-Nov 2020 in the UK. Both models find that the mean generation time in Sept-Nov 2020 is shorter than in the earlier periods of the study.

    This is a well-presented study with careful analysis and extensive sensitivity analysis, which shows that the modelled estimates are robust to a range of assumptions.

    We are pleased that the reviewer found our study to be well presented, and we thank them for recognising the significant sensitivity analyses that we performed to ensure that our results are robust.

  2. SciScore for 10.1101/2021.05.27.21257936:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study has some limitations. Since we used household transmission data in our analyses, the generation time for transmission outside the household may differ from our estimates. Future extensions to our approach may account for the possibility that more than one household member was infected in the same primary infection event or the potential for multiple sequential introductions of the virus into a household [37]. Allowing for multiple introductions may shorten estimates of the generation time, although any effect will depend significantly on the community prevalence and the number of contacts that household members have with individuals in the community. In contrast, accounting for potential co-primary infections is likely to lead to higher estimates of the generation time. Other further work may include exploring heterogeneity in the generation time distribution between individuals and/or households with different characteristics. This could involve, but is not limited to, estimating the generation time distribution for individuals of different age, sex and ethnicity. In summary, we have inferred the SARS-CoV-2 generation time distribution in the UK using household data and two different transmission models. A key output of this research is one of the only estimates of the SARS-CoV-2 generation time outside Asia. Another crucial feature of our analysis is that it was based on data from beyond the first few months of the pandemic. Since this research suggests that th...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.