A NOVEL METHOD FOR HANDLING PRE-EXISTING CONDITIONS IN PREDICTION MODELS FOR COVID-19 DEATH

Abstract

Objective

To derive a predicted probability of death (PDeathDx) based upon complete sets of ICD-10 codes assigned to patients prior to their diagnosis of COVID-19. PDeathDx is intended for use as a summary metric for pre-existing conditions in multivariate models for COVID-19 death.

Methods

Cases were identified through the COVID-19 Shared Data Resource (CSDR) of the Department of Veterans Affairs. The diagnosis required at least one positive nucleic acid amplification test (NAAT). The primary outcome was death within 60 days of the first positive test. We retrieved all diagnoses entered into the electronic medical record for visits, on problem lists, and at the time of hospital discharge if they were recorded at least 14 days prior to the NAAT. ICD-9 codes were converted to ICD-10 equivalents using a crosswalk provided by the Centers for Medicare & Medicaid Services. ICD-10 codes were then reduced to their category diagnoses, defined as all characters to the left of the decimal point. Each patient was considered to have or not have each category diagnosis prior to the NAAT. A computer program calculated, for each category diagnosis, the number of cases, the relative risk (RR) of death, and its confidence interval (CI), using a Bonferroni adjustment for multiple comparisons. RRs were re-centered by subtracting 1 so that high-risk conditions had a positive value while protective conditions had a negative one. Diagnoses found to be significant were entered into a logistic model for death in a stepwise fashion. For each category diagnosis, a patient was assigned the value (RR-1) if they had the condition or 0 otherwise. The resulting model was used to derive PDeathDx for each patient, and the area under its receiver operating characteristic (ROC) curve was calculated. Single-variable logistic models were also derived for age at diagnosis, the Charlson 2-year (Charl2Yr) and lifetime (CharlEver) scores, and the Elixhauser 2-year (Elix2Yr) and lifetime (ElixEver) scores. Stata was used to compare the ROCs for PDeathDx and each of the other metrics.
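The per-diagnosis steps described above (reducing a code to its category, computing an RR with a Bonferroni-adjusted CI, and encoding each patient as RR-1 or 0) can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the function names, the example counts, and the example weights are hypothetical, and the interval uses the log (Katz) method as an assumption, since the abstract does not say which interval was computed.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def icd10_category(code: str) -> str:
    """Return the category part of an ICD-10 code (everything left of the decimal point)."""
    return code.strip().upper().split(".")[0]


def relative_risk_ci(deaths_with, n_with, deaths_without, n_without,
                     n_comparisons=1, alpha=0.05):
    """Relative risk of death with a Bonferroni-adjusted confidence interval.

    Assumption: log (Katz) interval; the source does not specify the method.
    """
    rr = (deaths_with / n_with) / (deaths_without / n_without)
    se_log_rr = sqrt(1 / deaths_with - 1 / n_with
                     + 1 / deaths_without - 1 / n_without)
    # Bonferroni: split alpha across all comparisons, then across two tails.
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))
    low = exp(log(rr) - z * se_log_rr)
    high = exp(log(rr) + z * se_log_rr)
    return rr, low, high


def encode_patient(patient_codes, recentered_rr):
    """Map each retained category to (RR - 1) if the patient has it, else 0."""
    categories = {icd10_category(c) for c in patient_codes}
    return {cat: (weight if cat in categories else 0.0)
            for cat, weight in recentered_rr.items()}


# Hypothetical counts for one category diagnosis (not taken from the study).
rr, low, high = relative_risk_ci(30, 100, 10, 100, n_comparisons=1890)

# Illustrative re-centered weights: positive = high-risk, negative = protective.
weights = {"I10": rr - 1, "Z23": -0.25}
features = encode_patient(["I10", "E11.9"], weights)
```

In this sketch, `features` holds the values that would be passed to the logistic model as predictors. The interval formula breaks down when a cell has zero deaths, so a real implementation would need a rule for such categories.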

Results

On September 30, 2021, there were 347,220 COVID-19 patients in the CSDR. Of these, 18,120 patients (5.33%) died within 60 days of their diagnosis. After consolidating ICD-9 and ICD-10 codes, 29,162,710 separate diagnoses were given to the subjects, representing 41,341 ICD-10 codes. This set was reduced to 1,890 category diagnoses assigned to the group for the first time on 19,184,437 occasions. Of the 1,890 category diagnoses, 425 involved >= 100 subjects and had a lower boundary for the CI >= 1.50 (a high-risk condition) or an upper boundary <= 0.80 (a protective condition). Stepwise logistic regression showed that 153 were statistically significant, independent predictors of death. PDeathDx was slightly less powerful than age as a discriminator (ROC = 0.811 +/- 0.002 vs 0.812 +/- 0.001, respectively; P < 0.001) but was superior to the Charl2Yr (ROC = 0.727 +/- 0.002; P < 0.001), CharlEver (ROC = 0.753 +/- 0.002; P <= 0.001), Elix2Yr (ROC = 0.694 +/- 0.002; P < 0.001), and ElixEver (ROC = 0.731 +/- 0.002; P < 0.001). Univariate analysis and multivariate modeling showed that many of the highest-risk conditions are under-represented or not included in the Charlson Index. These include hypertension, dementia, degenerative neurologic disease, and diagnoses associated with severe physical disability.
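The screening rule that reduced 1,890 category diagnoses to 425 candidates can be written as a small filter. The sketch below assumes the thresholds apply exactly as stated above (at least 100 subjects, and a CI lower bound >= 1.50 or upper bound <= 0.80); the function name and the example intervals are hypothetical.

```python
def classify_category(n_subjects, ci_low, ci_high,
                      min_subjects=100,
                      high_risk_floor=1.50,
                      protective_ceiling=0.80):
    """Label a category diagnosis as a candidate predictor, or None if screened out."""
    if n_subjects < min_subjects:
        return None  # too few subjects to consider
    if ci_low >= high_risk_floor:
        return "high-risk"
    if ci_high <= protective_ceiling:
        return "protective"
    return None  # interval does not clear either threshold


# Hypothetical intervals, for illustration only.
print(classify_category(500, 1.62, 2.10))  # high-risk
print(classify_category(500, 0.55, 0.78))  # protective
print(classify_category(99, 2.00, 3.00))   # None
```

Categories that pass this filter are only candidates; in the study, the stepwise logistic regression then retained 153 of them as independent predictors.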

Conclusions

Our method for handling pre-existing conditions in multivariate analysis has many advantages over conventional comorbidity indices. The approach can be applied to any condition or outcome, can use any categorical predictors including medications, creates its own condition weights, handles rare as well as protective conditions, and returns actionable information to providers. This information includes the specific ICD-10 groups, their contribution to the risk, and their rank order of importance. Finally, PDeathDx is nearly equivalent to age as a discriminator of outcomes and outperforms the four other comorbidity scores. If validated by others, this method provides a more robust alternative for handling comorbidities in multivariate models.

Article activity feed

  1. SciScore for 10.1101/2022.01.22.22269694:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    The major limitation of our approach is that it handles all pre-existing diagnoses – not just the most recent ones. Thus, a person with chronic renal failure (CRF) who undergoes a transplant and regains normal renal function will still be included in the analysis of CRF. Of course, our conclusions are limited to patients with characteristics like the veteran population. Further studies should be done on other populations and disease states before the method should be widely applied. If validated by others, our method could provide a more robust alternative to comorbidity scores for handling pre-existing conditions in multivariate models.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.