Direct and indirect mortality impacts of the COVID-19 pandemic in the United States, March 1, 2020 to January 1, 2022

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    The authors examine the impacts of the COVID-19 pandemic on excess mortality in the US up to April 30, 2021. The authors separate direct impacts (caused by COVID-19, coded as such or not) of the pandemic from indirect impacts (disruptions), finding that most excess deaths (90%) are due to direct impacts. Importantly, the authors find that the official COVID-19 death tally is an undercount of these deaths. Moreover, the authors also find that excess deaths due to other causes are the main driver of excess mortality among younger populations. The paper is interesting and well written, although we have some concerns, particularly around the estimation of direct vs. indirect impacts.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

Excess mortality studies provide crucial information regarding the health burden of pandemics and other large-scale events. Here, we use time series approaches to separate the direct contribution of SARS-CoV-2 infection to mortality from the indirect consequences of the pandemic in the United States. We estimate excess deaths occurring above a seasonal baseline from March 1, 2020 to January 1, 2022, stratified by week, state, age, and underlying mortality condition (including COVID-19 and respiratory diseases; Alzheimer’s disease; cancer; cerebrovascular diseases; diabetes; heart diseases; and external causes, which include suicides, opioid overdoses, and accidents). Over the study period, we estimate an excess of 1,065,200 (95% Confidence Interval (CI) 909,800–1,218,000) all-cause deaths, of which 80% are reflected in official COVID-19 statistics. State-specific excess death estimates are highly correlated with SARS-CoV-2 serology, lending support to our approach. Mortality from 7 of the 8 studied conditions rose during the pandemic, with the exception of cancer. To separate the direct mortality consequences of SARS-CoV-2 infection from the indirect effects of the pandemic, we fit generalized additive models (GAM) to age-, state-, and cause-specific weekly excess mortality, using covariates representing direct (COVID-19 intensity) and indirect pandemic effects (hospital intensive care unit (ICU) occupancy and measures of intervention stringency). We find that 84% (95% CI 65–94%) of all-cause excess mortality can be statistically attributed to the direct impact of SARS-CoV-2 infection. We also estimate a large direct contribution of SARS-CoV-2 infection (≥67%) to mortality from diabetes, Alzheimer’s, heart diseases, and to all-cause mortality among individuals over 65 years.
In contrast, indirect effects predominate in mortality from external causes and all-cause mortality among individuals under 44 years, with periods of stricter interventions associated with greater rises in mortality. Overall, on a national scale, the largest consequences of the COVID-19 pandemic are attributable to the direct impact of SARS-CoV-2 infections; yet, the secondary impacts dominate among younger age groups and in mortality from external causes. Further research on the drivers of indirect mortality is warranted as more detailed mortality data from this pandemic becomes available.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    This is a very interesting paper trying to quantify excess deaths due to the COVID-19 pandemic in the USA. The paper is roughly divided into two main sections. In the first section, the authors estimate age and cause-specific excess mortality. In the second section, using their excess mortality estimates, the authors attempt to disentangle the impact of SARS-CoV-2 infection (direct impact) vs. the impact of NPIs on this excess mortality (indirect impact). I have some concerns, particularly with respect to the second section.

    The model used to estimate excess mortality is quite clear. The authors adjust the baseline model to account for low influenza circulation (and deaths) during the COVID-19 pandemic, to avoid underestimating the number of deaths caused by COVID-19. While this makes sense if the authors are trying to estimate the total number of deaths caused by COVID-19, I'm not sure it needs to be accounted for if the authors want to estimate excess/added deaths. A counterfactual scenario would've included influenza. It also raises the question of whether (conceptually) they should be adjusting for other causes of deaths that may have also decreased during the pandemic. The authors briefly acknowledge this in the discussion ("we can't account for changes in baseline respiratory mortality due to depressed circulation of endemic pathogens other than influenza") but my comment goes beyond respiratory diseases. Analyses of excess mortality from other settings have suggested, for example, decreased deaths due to fewer traffic accidents (not in the US) or due to decreased air pollution, and not accounting for these would also lead to an underestimate of the total deaths caused by COVID-19. I understand that it is not feasible to account for all potential factors, so I wonder if they should focus on reporting excess deaths as compared to a counterfactual with influenza.

    Thanks. We think it is helpful to “single out” influenza as it causes major fluctuations in mortality from multiple causes in regular years and is a useful reference to contrast the pandemic impact. But the reviewer’s point is well taken. We have clarified our assumptions about the meaning of the baseline in this analysis (methods p 5), discussed the depressed circulation of other pathogens in depth, and mentioned air pollution (p 12-13). We have also slightly reworked our comparison between COVID19 and influenza so that excess mortality estimates are comparable and now cover periods of the same duration (Nov 2017-Mar 2018 for flu and Nov 2020-Mar 2021 for COVID19, see Figure S11).
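
    To make the notion of a baseline concrete, the following is a minimal sketch of an excess-death calculation. It assumes a deliberately simple baseline, the mean of pre-pandemic deaths for each week of the year, which is a stand-in and not the seasonal model used in the paper. The function names and the toy numbers are hypothetical.

```python
from collections import defaultdict

def seasonal_baseline(history):
    """Average deaths per week of year from pre-pandemic data.

    history: list of (week_of_year, deaths) pairs. This week-of-year mean is
    an assumed simplification, not the paper's baseline model.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for week, deaths in history:
        totals[week] += deaths
        counts[week] += 1
    return {week: totals[week] / counts[week] for week in totals}

def excess_deaths(observed, baseline):
    """Observed minus expected deaths for each pandemic week."""
    return [deaths - baseline[week] for week, deaths in observed]

# Toy numbers, invented for illustration only.
history = [(10, 1000), (10, 1040), (11, 980), (11, 1000)]
pandemic = [(10, 1200), (11, 1500)]

baseline = seasonal_baseline(history)
weekly_excess = excess_deaths(pandemic, baseline)
total_excess = sum(weekly_excess)
```

    Whatever the baseline excludes (for example, influenza deaths that did not occur during the pandemic) lowers the expected count and therefore raises the excess, which is the conceptual choice the reviewer asks about.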

    The second section, trying to estimate direct vs. indirect effects is also very interesting. However, more details are required about the regression model used and, importantly, what the assumptions and limitations of the approach are. Specifically:

    • Please provide a bit more information on the regression used for direct vs. indirect effects. I'd like to see explicit discussion of the assumptions and limitations of the approach but also of the stringency index used. Does this model include an intercept? Was the association between stringency index and excess deaths assumed to be linear? Or were different functional forms considered? It is also not clear how well the model fits the data.

    Thanks for these comments, which helped us improve this section. We have provided more details about the stringency index in methods (it captures the “sum” of interventions), described the model in methods and supplement, and discussed limitations in the caveats section, especially regarding the effectiveness of these interventions (p13). We had tried different linear models with and without intercepts but elected to use models with intercepts so as not to overly constrain the relationship between interventions, COVID19 activity and excess mortality. These models also incorporate lags in the predictors that are determined by cross-correlation analysis (as detailed in supplement). In the revised version, we now use GAMs, in which the relationships between excess mortality and predictors do not have to be linear. We can do so because we were able to add several weeks of data (the regression is now based on 96 pandemic weeks from March 1, 2020 to January 1, 2022). The models are described in detail in supplement p 4-5, and we now specify that they have intercepts. We have also provided additional plots of model fits in main text and supplement (Figures 4 and S16-19).
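
    As an illustration of how a lag can be chosen by cross-correlation, here is a minimal sketch. It assumes the lag is the shift that maximizes the Pearson correlation between a predictor and the outcome; the exact criterion used in the supplement is not stated here, and the function names and series are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def best_lag(predictor, outcome, max_lag):
    """Find the lag (in weeks) at which predictor[t - lag] best tracks outcome[t].

    Assumed criterion: highest Pearson correlation over lags 0..max_lag.
    """
    scores = {}
    for lag in range(max_lag + 1):
        x = predictor[: len(predictor) - lag]  # predictor shifted back by `lag`
        y = outcome[lag:]                      # outcome aligned to it
        scores[lag] = pearson(x, y)
    return max(scores, key=scores.get), scores

# Toy series, invented for illustration: deaths follow cases by 2 weeks.
cases = [1, 3, 8, 15, 9, 4, 2, 1, 1, 1]
deaths = [0, 0, 1, 3, 8, 15, 9, 4, 2, 1]

lag, scores = best_lag(cases, deaths, max_lag=4)
```

    With these toy series the selected lag is 2 weeks, because the shifted case series then coincides with the death series.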

    • Related to the above, please provide more details on how the results of the regressions were translated into the results presented. The main text reports percentages, but the methods only briefly explain how numbers of direct deaths were calculated, and the supplementary tables report coefficients. It is not clear if these estimates of direct and indirect deaths were somehow constrained to add up to the total number of excess deaths, but it doesn't seem like it since point estimates cross 100% in some cases.

    As discussed in response to one of the editor’s questions, estimates are not constrained to 100%. We have provided more details in the supplement on how we estimate the direct impact of the pandemic. Briefly, we calculate expected deaths in the GAM with all predictors set to their observed values, and again with the COVID19 predictor set to zero. The direct impact is the difference between the two predictions, divided by the predictions of the full model.

    We note that while some of the estimates derived from the GAM exceed 100% (and are similar to the linear model estimates presented in the initial analysis, before revision), these estimates echo the findings from a more empirical analysis, in which we compare all-cause excess deaths with official COVID19 death tallies. There, in the two oldest age groups, we find more official COVID19 deaths than estimated by the excess mortality models. Hence both analyses point to an underestimation of the direct burden of COVID19 by the excess mortality approach, specific to the oldest age groups. We return to this point in depth in the discussion (p 12-13) and consider the possible effects of harvesting, depressed circulation of non-SARS pathogens, and inaccurate coding of official statistics (as pointed out by reviewer #3).
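
    The calculation described above can be sketched as follows. The prediction function is a hypothetical linear stand-in with invented coefficients; the fitted GAM uses smooth terms, and nothing here reproduces the paper's estimates. The sketch only shows the counterfactual logic and why the resulting share is not bounded at 100%.

```python
def predict_excess(covid_intensity, stringency):
    """Hypothetical stand-in for a fitted model's weekly excess-death prediction.

    Coefficients are invented for illustration; the real model is a GAM.
    """
    return 50.0 + 120.0 * covid_intensity - 0.8 * stringency

def direct_share(weeks):
    """Share of predicted excess deaths attributed to the COVID-19 predictor.

    weeks: list of (covid_intensity, stringency) observed values.
    """
    full = sum(predict_excess(c, s) for c, s in weeks)
    no_covid = sum(predict_excess(0.0, s) for _, s in weeks)  # COVID term set to zero
    return (full - no_covid) / full

# Toy inputs, invented for illustration only.
weeks = [(1.0, 40.0), (2.0, 60.0), (0.5, 50.0)]
share = direct_share(weeks)

# If the counterfactual prediction without COVID-19 is negative,
# the ratio exceeds 1, i.e. the share is above 100%.
share_over = direct_share([(2.0, 80.0)])
```

    In the first toy case the share is 420/450, about 93%. In the second, the model predicts fewer deaths than baseline once the COVID-19 term is removed, so the difference is larger than the full-model prediction and the share is above 100%.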

    • Please discuss the potential limitations of using the stringency index to quantify NPIs.

    Several limitations have been added to caveats (p 13); major issues include aggregation of multiple interventions into a single index, which does not consider the actual implementation nor the effect of interventions. The index is solely based on mandates in place in different locations and time periods. We also assume that the effectiveness of these interventions, for a given level of stringency, does not change over time.

    • When estimating direct and indirect effects, the paper assumes that the estimated parameter is time-invariant? Indirect effects might have changed over the course of the epidemic by factors not necessarily captured by the stringency index used, particularly since the index doesn't take into account the implementation of the measures. Have the authors tested this assumption?

    This is an interesting point, which we have explored further. The non-linear relationships we find between NPIs and chronic condition excess mortality may suggest that the reviewer is right. We now discuss the role of NPIs in the results section in much more depth than we did previously (bottom of p8).

    “At lower levels of interventions (Oxford index between 0 and 50), representing the early stages of the lockdown in March 2020, excess mortality rose with interventions. Later in the pandemic, increased interventions were estimated to have a beneficial effect on excess mortality, driven by comparison between the period when interventions were strengthened in response to increasing COVID19 activity in late 2020 (Oxford index above 60) to the period when interventions were relaxed in 2021 (Oxford index between 50 and 60).”

    We cannot run an analysis over different time windows because NPI and time are highly conflated (for instance, the NPI index rises from 0 to 50% in the very early part of the lockdown period and then stays above 50% for the rest of the study, so we cannot compare the effect of a 25% level in 2020 and 2021). We have added this limitation in the caveat section p.13.

    • The authors state "In contrast, the indirect impact of the pandemic measured by the intervention term was highest in youngest age groups, decreased with age, and lost significance in individuals above 65 years" - I'm not entirely sure of where this statement comes from? For example Table S3 suggests that the indirect effect (multivariate or univariate) is higher in 25-64 yo than in <25s? The same table also suggests negative impacts (protective effects?) in >75s in the multivariate model. Please clarify.

    There are fewer deaths in those under 25 years, which is why the coefficients were lower overall in Table S3. Yet we find that the proportion of variance explained by interventions is higher in the under 25 yrs than in 25-44 yrs.

    We have now changed our modeling strategy to use GAMs, so Table S3 is no longer relevant, but the main conclusion remains valid: interventions explain a larger relative portion of excess mortality in the under 25 yrs than in the other age groups, and a larger portion than other covariates. The NPI term is now significant in all groups (although the relative contribution of NPI still declines with age, as in the prior analysis), so we have rephrased this sentence: “In contrast, the relative contribution of indirect effects, via the intervention variable, was highest in youngest age groups and decreased with age”.

    • How do the authors interpret "Percents of excess deaths" over 100%? Similarly, I don't fully understand how to interpret "The upper bound of the 95% confidence interval for heart diseases was above 100% (158%), suggesting that for every excess death from heart disease estimated by our model, up to 1.58 death from heart disease could be directly linked to SARS-CoV-2 infection."

    We have rephrased this section although the overall conclusions remain unchanged. The GAM estimate of the direct COVID19 impact is statistically significantly above 100% in the 85 yo and over, suggesting that our excess mortality approach is too conservative and does not estimate enough COVID19 excess deaths in this age group. We draw a similar conclusion from a more empirical analysis, in which we compare all-cause excess death estimates with official COVID19 death tallies. In this analysis, we find more official COVID19 deaths than estimated by the excess mortality models in the two oldest age groups (point estimates above 100% in the 75-84 and 85+ yrs). Hence both analyses point to an underestimation of the direct burden of COVID19 in the oldest age groups by excess mortality approaches.

    Rephrased results section bottom of p.9: “We estimate that the direct contribution of COVID-19 to excess mortality increases with age, from negative and non-statistically significant in individuals under 25 yrs to over 100% in those over 85 years, echoing the gradient seen in official statistics (Table 4). It is also worth noting that our excess mortality estimates may be too conservative (too high) as we did not account for missed circulation of endemic pathogens. This could explain why our estimates of direct COVID-19 contribution exceed 100% in the oldest age group.”

    We return to this point in depth in the discussion and consider the possible effects of harvesting and depressed circulation of non SARS pathogens (p 12-13).

    • Table 3: The signs of the point estimate vs CI for vehicle accidents are inconsistent.

    Thanks, this was a typo. It should have been 4300 (-700, 9300) excess deaths from accidents. This has been updated with more recent data.

    Reviewer #3 (Public Review):

    Authors examine mortality data in the US and use time-series approaches to estimate excess mortality during the COVID-19 pandemic.

    Major comments:

    I would encourage authors to discuss the two different concepts of excess mortality:

    (#1) what deaths were caused, directly or indirectly, by the pandemic. This is what the authors have aimed to assess, and I have no major concerns with the methodology

    (#2) how many additional deaths occurred during the pandemic, compared to what would have been expected in the absence of a pandemic. For such an analysis I think expected annual influenza deaths should be added back to the baseline (or subtracted from the excess)? Some of the discussion seems to relate more to an impression of #2 rather than #1 but I would be interested in the authors' thoughts.

    We have added more details about the approach, in particular why we think that #1 is the proper analysis here (see methods p 5). Given the sheer magnitude of COVID19 excess deaths (over 1 million excess deaths at the end of our study), adding back influenza deaths (up to 52,000 deaths in a recent severe season with a mismatched vaccine, as in 2017-18) would not make a large difference. We have also provided a more direct comparison of the impact of influenza and COVID19.
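
    As a back-of-envelope check of this statement, the sketch below compares the two figures quoted in this document: the all-cause excess estimate from the abstract and the influenza figure given above. The variable names are hypothetical, and the ratio is a rough illustration, not a result from the paper.

```python
total_excess = 1_065_200    # all-cause excess deaths over the study period (abstract)
flu_severe_season = 52_000  # upper influenza figure quoted in the response above

# Fraction of the pandemic excess that one severe influenza season would represent.
flu_fraction = flu_severe_season / total_excess
```

    The ratio comes out just under 5%, which is small relative to the width of the reported confidence interval (909,800 to 1,218,000).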

    1. Authors estimate fewer excess COVID deaths in the elderly than there were confirmed deaths (Table 3). Could this be an indication of some confirmed deaths being "deaths with COVID" rather than "deaths from COVID"? I'm not sure how to interpret the %s in the final column when they exceed 100%. The authors suggested a harvesting effect but I would suggest "deaths with COVID" might be a more likely explanation? This issue can be a limitation of confirmed-death data.

    This is a good point. We have added a comment along these lines in discussion in the middle of p 12. Still, we think harvesting and/or the depressed circulation of endemic pathogens, which would have inflated our baseline, are more likely explanations for these findings. This is because we find similar estimates (exceeding 100%) in gam models that ignore official statistics and rely on COVID19 case data, or COVID19 hospital occupancy data, and this suggests that other mechanisms, beyond coding of official mortality statistics, are at play.

    Yet, as more detailed official statistics become available, a tabulation of confirmed deaths by presence of a primary vs secondary COVID (U07) code may be revealing and get more directly at the reviewer’s question.

  2. Evaluation Summary:

    The authors examine the impacts of the COVID-19 pandemic on excess mortality in the US up to April 30, 2021. The authors separate direct impacts (caused by COVID-19, coded as such or not) of the pandemic from indirect impacts (disruptions), finding that most excess deaths (90%) are due to direct impacts. Importantly, the authors find that the official COVID-19 death tally is an undercount of these deaths. Moreover, the authors also find that excess deaths due to other causes are the main driver of excess mortality among younger populations. The paper is interesting and well written, although we have some concerns, particularly around the estimation of direct vs. indirect impacts.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

  3. Reviewer #1 (Public Review):

    This is a very interesting paper trying to quantify excess deaths due to the COVID-19 pandemic in the USA. The paper is roughly divided into two main sections. In the first section, the authors estimate age and cause-specific excess mortality. In the second section, using their excess mortality estimates, the authors attempt to disentangle the impact of SARS-CoV-2 infection (direct impact) vs. the impact of NPIs on this excess mortality (indirect impact). I have some concerns, particularly with respect to the second section.

    The model used to estimate excess mortality is quite clear. The authors adjust the baseline model to account for low influenza circulation (and deaths) during the COVID-19 pandemic, to avoid underestimating the number of deaths caused by COVID-19. While this makes sense if the authors are trying to estimate the total number of deaths caused by COVID-19, I'm not sure it needs to be accounted for if the authors want to estimate excess/added deaths. A counterfactual scenario would've included influenza. It also raises the question of whether (conceptually) they should be adjusting for other causes of deaths that may have also decreased during the pandemic. The authors briefly acknowledge this in the discussion ("we can't account for changes in baseline respiratory mortality due to depressed circulation of endemic pathogens other than influenza") but my comment goes beyond respiratory diseases. Analyses of excess mortality from other settings have suggested, for example, decreased deaths due to fewer traffic accidents (not in the US) or due to decreased air pollution, and not accounting for these would also lead to an underestimate of the total deaths caused by COVID-19. I understand that it is not feasible to account for all potential factors, so I wonder if they should focus on reporting excess deaths as compared to a counterfactual with influenza.

    The second section, trying to estimate direct vs. indirect effects is also very interesting. However, more details are required about the regression model used and, importantly, what the assumptions and limitations of the approach are. Specifically:

    - Please provide a bit more information on the regression used for direct vs. indirect effects. I'd like to see explicit discussion of the assumptions and limitations of the approach but also of the stringency index used. Does this model include an intercept? Was the association between stringency index and excess deaths assumed to be linear? Or were different functional forms considered? It is also not clear how well the model fits the data.
    - Related to the above, please provide more details on how the results of the regressions were translated into the results presented. The main text reports percentages, but the methods only briefly explain how numbers of direct deaths were calculated, and the supplementary tables report coefficients. It is not clear if these estimates of direct and indirect deaths were somehow constrained to add up to the total number of excess deaths, but it doesn't seem like it since point estimates cross 100% in some cases.
    - Please discuss the potential limitations of using the stringency index to quantify NPIs.
    - When estimating direct and indirect effects, the paper assumes that the estimated parameter is time-invariant? Indirect effects might have changed over the course of the epidemic by factors not necessarily captured by the stringency index used, particularly since the index doesn't take into account the implementation of the measures. Have the authors tested this assumption?
    - The authors state "In contrast, the indirect impact of the pandemic measured by the intervention term was highest in youngest age groups, decreased with age, and lost significance in individuals above 65 years" - I'm not entirely sure of where this statement comes from? For example Table S3 suggests that the indirect effect (multivariate or univariate) is higher in 25-64 yo than in <25s? The same table also suggests negative impacts (protective effects?) in >75s in the multivariate model. Please clarify.
    - How do the authors interpret "Percents of excess deaths" over 100%? Similarly, I don't fully understand how to interpret "The upper bound of the 95% confidence interval for heart diseases was above 100% (158%), suggesting that for every excess death from heart disease estimated by our model, up to 1.58 death from heart disease could be directly linked to SARS-CoV-2 infection."
    - Table 3: The signs of the point estimate vs CI for vehicle accidents are inconsistent.

  4. Reviewer #2 (Public Review):

    In this paper, the authors examine the impacts of the COVID-19 pandemic on excess mortality in the US up to April 30, 2021. The authors separate direct impacts (caused by COVID-19, coded as such or not) of the pandemic from indirect impacts (disruptions), finding that most excess deaths (90%) are due to direct impacts. Importantly, the authors find that the official COVID-19 death tally is an undercount of these deaths. Moreover, the authors also find that excess deaths due to other causes are the main driver of excess mortality among younger populations.

    This study's strengths include the use of whole population surveillance data (vital registration) for a long period of time, the modeling of excess mortality that allows getting a more accurate estimate of the impacts of the pandemic, and the attempt to separate direct from indirect impacts. The key weaknesses are a lack of consideration of the challenges of using vital registration data and a lack of explicit outlining of causal assumptions for the part of the paper examining the effects of non-pharmaceutical interventions.

    While I do believe that the authors have achieved their aims, especially in determining the number of excess deaths (all cause and by cause, and by age and state), the part about separating direct from indirect impacts may require some extra work in outlining assumptions or clarifying methods in order for it to be fully appraisable.

    This is an important manuscript and study, given the complicated nature of coding causes of death (especially with a new disease), allowing for a more precise estimation of the impacts of the pandemic. If the authors address issues related to the direct vs indirect impacts, I believe that part can also be very important, as it would allow for better differentiation of the effects of COVID-19 vs the effects of human action.

  5. Reviewer #3 (Public Review):

    Authors examine mortality data in the US and use time-series approaches to estimate excess mortality during the COVID-19 pandemic.

    Major comments:

    I would encourage authors to discuss the two different concepts of excess mortality:
    (#1) what deaths were caused, directly or indirectly, by the pandemic. This is what the authors have aimed to assess, and I have no major concerns with the methodology
    (#2) how many additional deaths occurred during the pandemic, compared to what would have been expected in the absence of a pandemic. For such an analysis I think expected annual influenza deaths should be added back to the baseline (or subtracted from the excess)? Some of the discussion seems to relate more to an impression of #2 rather than #1 but I would be interested in the authors' thoughts.

    2. Authors estimate fewer excess COVID deaths in the elderly than there were confirmed deaths (Table 3). Could this be an indication of some confirmed deaths being "deaths with COVID" rather than "deaths from COVID"? I'm not sure how to interpret the %s in the final column when they exceed 100%. The authors suggested a harvesting effect but I would suggest "deaths with COVID" might be a more likely explanation? This issue can be a limitation of confirmed-death data.

  6. SciScore for 10.1101/2022.02.10.22270721:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Antibodies
    Sentence: We used data on the proportion of the population with SARS-CoV-2 antibodies to the nucleocapsid by late April 2021 to compare with our excess death estimates at the end of April 2021, given a similar delay between infection and death and infection and rise in antibodies.
    Resource: SARS-CoV-2
    Suggested: None

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study is subject to several limitations. First, mortality counts below the minimum cut-off value of 10 were suppressed due to privacy regulations. As a result, our age-specific analyses are restricted to larger states, and we could not assess the role of race/ethnicity. Prior work has shown important disparities in COVID-19 impact by race/ethnicity and economic status (Mena et al., 2021; Rossen, 2021) in the US and abroad. Second, official coding practices may have changed between states and through time based on SARS-CoV-2 testing availability, location of death, demographic factors, and comorbidities. Third, we find periods of negative excesses in cancer (throughout the pandemic), cardiovascular, and heart diseases (fall 2020), possibly due to changes in ascertainment of underlying cause of death (e.g. a death in a cancer patient with COVID-19 is ascribed to COVID-19) or harvesting (Saha et al., 2013). As discussed earlier, harvesting could also have affected estimates in oldest age groups. Similarly, we can’t account for changes in baseline respiratory mortality due to depressed circulation of endemic pathogens other than influenza. Finally, our study ends in April 2021 and does not capture a recrudescence of COVID19-related deaths due to the more transmissible Delta variant, primarily in states with low vaccine coverage, nor do we estimate the impact of the Omicron immune escape variant. As a result, our excess mortality estimates should be deemed conservative. Pandem...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.