Estimating area-level variation in SARS-CoV-2 infection fatality ratios
Abstract
Background
During a pandemic, estimates of geographic variability in disease burden are important but limited by the availability and quality of data.
Methods
We propose a framework for estimating geographic variability in testing effort, total number of infections, and infection fatality ratio (IFR). Because symptomatic people are more likely to seek testing, we use a noncentral hypergeometric model that accounts for differential probability of positive tests. We apply this framework to the United States (U.S.) COVID-19 pandemic to estimate county-level SARS-CoV-2 IFRs from March 1, 2020 to October 31, 2020. Using data on population size, number of observed cases, number of reported deaths in each U.S. county and state, and number of tests in each U.S. state, we develop a series of estimators to identify the number of SARS-CoV-2 infections and IFRs at the county level. We then perform a simulation and compare the estimated values to simulated values to demonstrate the validity of our approach.
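To illustrate how a noncentral hypergeometric model captures preferential testing, the following is a minimal sketch that assumes Fisher's variant of the distribution (the abstract does not say which variant is used). The function names, the toy county sizes, and the odds value are hypothetical and chosen only for illustration. Here `odds` is the odds of testing an infected versus an uninfected person; a value of 1 recovers the ordinary hypergeometric model in which testing is unrelated to infection status.

```python
from math import comb


def _weights(infected, uninfected, tests, odds):
    """Unnormalised Fisher noncentral hypergeometric weights over the support."""
    lo = max(0, tests - uninfected)   # fewest positives possible
    hi = min(tests, infected)         # most positives possible
    ks = list(range(lo, hi + 1))
    ws = [comb(infected, k) * comb(uninfected, tests - k) * odds ** k for k in ks]
    return ks, ws


def positives_pmf(x, infected, uninfected, tests, odds):
    """P(exactly x of the tests are positive) under the assumed model."""
    ks, ws = _weights(infected, uninfected, tests, odds)
    if x < ks[0] or x > ks[-1]:
        return 0.0
    return ws[x - ks[0]] / sum(ws)


def expected_positives(infected, uninfected, tests, odds):
    """Mean number of positive tests under the assumed model."""
    ks, ws = _weights(infected, uninfected, tests, odds)
    return sum(k * w for k, w in zip(ks, ws)) / sum(ws)


# Hypothetical toy county: 20 infected, 180 uninfected, 50 tests performed.
unbiased = expected_positives(20, 180, 50, odds=1.0)
preferential = expected_positives(20, 180, 50, odds=5.0)
```

In this toy setting, preferential testing (`odds=5.0`) yields more expected positives than unbiased testing, which is why observed case counts alone cannot be scaled up to infections without modeling the testing odds. The exact-integer binomial coefficients keep the sketch simple but limit it to small populations.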
Findings
Applying the county-level estimators to the real, unsimulated COVID-19 data spanning March 1, 2020 to October 31, 2020 from across the U.S., we found that IFRs varied from 0 to 0.0273, with an interquartile range of 0.0022 and a median of 0.0018. The estimators for IFRs, number of infections, and number of tests showed high accuracy and precision; for instance, when applied to simulated validation data sets, across counties, Pearson correlation coefficients between estimator means and true values were 0.88, 0.95, and 0.74, respectively.
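The validation metric mentioned above, a Pearson correlation between estimator means and true simulated values across counties, can be sketched as follows. The numbers in the lists are made-up toy values, not results from the paper, and the helper name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy values for four hypothetical counties (illustrative only).
true_ifr = [0.001, 0.002, 0.004, 0.003]
estimated_ifr_mean = [0.0012, 0.0018, 0.0035, 0.0033]
r = pearson(true_ifr, estimated_ifr_mean)
```

A coefficient near 1 indicates that the estimator ranks and scales counties consistently with the simulated truth; it does not by itself rule out a constant bias, since Pearson correlation is invariant to shifts and positive rescaling.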
Interpretation
We propose an estimation framework that can be used to identify area-level variation in IFRs and that performs well in estimating county-level IFRs during the U.S. COVID-19 pandemic.
Article activity feed
SciScore for 10.1101/2021.12.04.21267288: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Similar to all models, our results are dependent on the quality of the data, which has limitations. Nonetheless, an advantage of our approach is that it allows investigators to rely on the best available data and estimate quantities where data may be lacking. Our estimates use data on NAAT (also known as molecular or PCR tests) and do not include data from antigen tests, including home tests, which did not receive FDA approval until late 2020. The NAAT are the most reliable for detecting infections and were the most widely used during the study period. Our approach relies on a number of assumptions that can be relaxed to allow for generalization. First, Assumption 1 (that the odds of testing an infected vs. an uninfected individual are the same for all counties, states, and the entire country) can be relaxed in numerous ways. While some geographic or temporal structure must be assumed to estimate the odds ratios, which are latent variables, this structure does not need to be geographically homogeneous. With additional covariates (e.g., data on demographics, health care access), it may also be possible to further infer specific aspects of the odds ratios. Second, regarding Assumption 2, while an overall IFR must be assumed, this value can in principle be any number between 0 and 1 depending on the disease, time, location, and other factors. Moreover, when the aim is to rank IFR in subregions relative to each other instead of estimating their precise values, the assumed overall IFR m...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
