A Predictive Model for Severe COVID-19 in the Medicare Population: A Tool for Prioritizing Primary and Booster COVID-19 Vaccination

Abstract

Recommendations for prioritizing COVID-19 vaccination have focused on the elderly at higher risk for severe disease. Existing models for identifying higher-risk individuals lack the needed integration of socio-demographic and clinical risk factors. Using multivariate logistic regression and random forest modeling, we developed a predictive model of severe COVID-19 using clinical data from Medicare claims for 16 million Medicare beneficiaries and socio-economic data from the CDC Social Vulnerability Index. Predicted individual probabilities of COVID-19 hospitalization were then calculated for population risk stratification and vaccine prioritization and mapping. The leading COVID-19 hospitalization risk factors were non-white ethnicity, end-stage renal disease, advanced age, prior hospitalization, leukemia, morbid obesity, chronic kidney disease, lung cancer, chronic liver disease, pulmonary fibrosis or pulmonary hypertension, and chemotherapy. However, previously reported risk factors such as chronic obstructive pulmonary disease and diabetes conferred modest hospitalization risk. Among all social vulnerability factors, residence in a low-income zip code was the only risk factor independently predicting hospitalization. This multifactor risk model and its population risk dashboard can be used to optimize COVID-19 vaccine allocation in the higher-risk Medicare population.
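
As a rough illustration of how a fitted logistic regression turns risk factors into an individual predicted probability and a risk tier, consider the minimal sketch below. The intercept, coefficients, feature names, and tier cut-offs are invented placeholders chosen for readability. They are not estimates from the study, and the authors' actual model specification may differ.

```python
import math

# Hypothetical log-odds coefficients for illustration only;
# these are NOT values reported in the paper.
INTERCEPT = -3.0
COEFFICIENTS = {
    "end_stage_renal_disease": 0.9,
    "age_85_plus": 0.7,
    "prior_hospitalization": 0.6,
    "low_income_zip": 0.2,
}


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def predicted_probability(features):
    """Predicted probability of hospitalization for one beneficiary.

    `features` maps a risk-factor name to 0 or 1; missing names count as 0.
    """
    logit = INTERCEPT + sum(
        beta * features.get(name, 0) for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def risk_tier(probability, cutoffs=(0.10, 0.20)):
    """Assign a tier from assumed cut-offs (low < 0.10 <= medium < 0.20 <= high)."""
    low_cut, high_cut = cutoffs
    if probability >= high_cut:
        return "high"
    if probability >= low_cut:
        return "medium"
    return "low"


if __name__ == "__main__":
    person = {"end_stage_renal_disease": 1, "age_85_plus": 1}
    p = predicted_probability(person)
    print(f"predicted probability: {p:.3f}, tier: {risk_tier(p)}")
    for name, beta in COEFFICIENTS.items():
        print(f"{name}: odds ratio {odds_ratio(beta):.2f}")
```

Counting beneficiaries per tier within a jurisdiction would then give the kind of population risk stratification the abstract describes.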

Article activity feed

  1. SciScore for 10.1101/2020.10.28.20219816:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Medication NDC codes were used to identify active pharmaceutical ingredients, which were grouped by pharmaceutical class by mapping to RxNorm codes.
    RxNorm
    suggested: (RxNorm, RRID:SCR_006645)
    Because the odds ratios derived from logistic regression models are a measure of the association between a given feature (e.g. North American Native ethnicity) and the outcome (e.g. hospitalization), we supplemented our analyses with a random forest machine learning algorithm, which produces computed Feature Importance values (Python, scikit-learn version 0.22.1 with RandomForestClassifier and GridSearchCV packages) [21] and provides information about the relative importance of each feature for predicting outcomes for the entire sample.
    Python
    suggested: (IPython, RRID:SCR_001658)
    scikit-learn
    suggested: (scikit-learn, RRID:SCR_002577)
    GridSearchCV
    suggested: None
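
The random forest step quoted in the table above can be sketched roughly as follows. This is only an illustration on synthetic data: the feature names, the parameter grid, the scoring metric, and the fold count are assumptions, since the quoted sentence names only RandomForestClassifier and GridSearchCV and does not give the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hypothetical feature names standing in for claims-derived risk factors.
FEATURE_NAMES = [
    "feature_a", "feature_b", "feature_c",
    "feature_d", "feature_e", "feature_f",
]

# Assumed, deliberately small grid; the study's grid is not stated.
PARAM_GRID = {"n_estimators": [25, 50], "max_depth": [3, None]}


def rank_features(X, y, feature_names):
    """Tune a random forest by cross-validated grid search and return
    (best_params, features sorted by descending importance)."""
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid=PARAM_GRID,
        scoring="roc_auc",  # assumed metric
        cv=3,               # assumed number of folds
    )
    search.fit(X, y)
    importances = search.best_estimator_.feature_importances_
    ranked = sorted(
        zip(feature_names, importances), key=lambda pair: pair[1], reverse=True
    )
    return search.best_params_, ranked


if __name__ == "__main__":
    # Synthetic stand-in for the real data, which is not openly available.
    X, y = make_classification(
        n_samples=300, n_features=6, n_informative=3, random_state=0
    )
    best_params, ranked = rank_features(X, y, FEATURE_NAMES)
    print("best parameters:", best_params)
    for name, importance in ranked:
        print(f"{name}: {importance:.3f}")
```

Unlike an odds ratio, each importance value describes how much a feature contributes to prediction across the whole sample, and the values sum to 1 across features.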

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    One of the main limitations of our study and derived models is that it is only based on the Medicare FFS population, which represents approximately 60% of the total Medicare population (with regional variations ranging from 98% in Alaska to 51% in Minnesota). There is evidence that Medicare Advantage plans tend to enroll beneficiaries who are healthier than Medicare FFS beneficiaries; this difference in health status will limit the generalization of our model to the entire Medicare population. If the model were to be used for vaccine allocation, it could be updated using Medicare Advantage data if it were made available. With the above limitations, the models we have developed provide important information for clinicians and policy makers to consider. Specifically, because the models integrating both socio-economic factors and individual clinical data respond to the recommendations of the NAM for prioritization and allocation of Covid-19 vaccines, they could be used to support planning a vaccination campaign. Figure 5 displays a histogram of the distribution of the predicted probabilities of hospitalization for SARS-CoV-2 infected patients, and when such data are mapped, they enable planners to estimate how many high-risk beneficiaries reside in a jurisdiction, and of those, how many are in socially vulnerable areas. Our models identifying individuals at risk for severe Covid-19 could also be used by the Medicare program, in collaboration with state and local health officials, t...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.