Predicting missed health care visits during the COVID-19 pandemic using machine learning methods: Evidence from 55,500 individuals from 28 European Countries
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Background
The COVID-19 pandemic has led many individuals to miss essential care. Machine-learning models that predict which patients are at greatest risk of missing care visits can help health administrators prioritize retention efforts toward the patients with the greatest need. Such approaches may be especially useful for efficiently targeting interventions in health systems overburdened by the COVID-19 pandemic.
Methods
We compare the performance of four machine learning algorithms to predict missed health care visits based on common patient characteristics available to most health care providers. We use data from 55,500 respondents of the Survey of Health, Ageing and Retirement in Europe (SHARE) COVID-19 survey (June – September 2020) in conjunction with longitudinal data from waves 1-8 (April 2004 – March 2020). We use stepwise selection, group lasso, random forest and neural network algorithms and employ 5-fold cross-validation to test the prediction accuracy, sensitivity, and specificity of the selected models.
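The 5-fold cross-validation mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_fold_indices`, the seed, and the toy sample size are assumptions made for illustration only. It shows only the splitting step, in which every observation is held out for testing exactly once.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs (hypothetical helper)."""
    idx = list(range(n))
    # Shuffle once so that folds do not depend on the original row order.
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training set: every fold except the one held out for testing.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


# Toy usage with 23 observations (an arbitrary illustrative size).
splits = k_fold_indices(23, k=5)
```

In an evaluation loop, a model would be fitted on each `train` set and its accuracy, sensitivity, and specificity computed on the matching `test` set, then averaged across the five folds.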
Findings
Within our sample, 15.5% of the respondents reported any missed essential health care visit due to the COVID-19 pandemic. All four machine learning methods perform similarly in their predictive power. When classifying all individuals with a predicted probability for missed care above 17% as at risk of a missed visit, they correctly identify between 41% and 53% of the respondents at risk (sensitivity), while correctly identifying between 64% and 74% of the individuals not at risk (specificity). We find that the sensitivity and specificity of the models are strongly related to the risk threshold used to classify individuals; thus, the models can be calibrated depending on users’ resource constraints and targeting approach. All models had an area under the curve around 0.62, indicating that they outperform random prediction.
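To make the threshold-dependent metrics concrete, the following sketch computes sensitivity, specificity, and the area under the ROC curve from predicted probabilities. The function names and the toy data are hypothetical and are not taken from the study; only the 17% cut-off comes from the text above.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], p_hat: Sequence[float],
                            threshold: float = 0.17) -> Tuple[float, float]:
    """Classify as 'at risk' when the predicted probability exceeds the threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, p_hat):
        flagged = p > threshold
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def auc(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    # Ties count as half a win (Mann-Whitney formulation).
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 1 = missed visit, 0 = no missed visit.
y = [1, 1, 0, 0, 0, 1, 0, 0]
p = [0.30, 0.10, 0.20, 0.05, 0.12, 0.25, 0.16, 0.40]

sens, spec = sensitivity_specificity(y, p, threshold=0.17)
sens_low, spec_low = sensitivity_specificity(y, p, threshold=0.05)
area = auc(y, p)
```

Lowering the threshold flags more people, which raises sensitivity at the cost of specificity. This is the trade-off that lets users calibrate the cut-off to their resource constraints.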
Interpretation
Pandemics such as COVID-19 require rapid and efficient responses to reduce disruptions in health care. Based on characteristics available to health insurance providers, machine learning algorithms can be used to efficiently target efforts to reduce missed essential care.
Funding
Research in this article is a part of the European Union’s H2020 SHARE-COVID19 project (Grant Agreement No. 101015924).
Article activity feed
SciScore for 10.1101/2022.03.01.22271611: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Our study faces several limitations. First of all, our data relies on self-reported information on missed essential health care visits. This might introduce a bias if discrepancies between self-reported and objective missed health care visits are not random. In addition, participants were asked whether they had missed any health care visits since the onset of the COVID-19 pandemic. This might imply different recall periods for participants, as the onset of the pandemic is not a clear date and varies across locations. Still, given that nearly all European countries imposed the first lockdown within a time span of two weeks [36], and that this lockdown was an unparalleled, significant event, the resulting bias might be low. Similarly, most of the interviews took place within two months. While this might increase the recall period for the later participants, the interviews were conducted between the first and the second COVID-19 wave, in a time of comparatively low infection and death rates, such that most of the missed health care visits are expected to have already taken place before the start of the survey. Finally, we do not include missed health care at specialists, as the data combines specialists and dentists in one item. Given the high share of missed regular dentist check-ups recorded in other studies [37,38], we expect that these contribute to the majority of missed visits in this item, and thus are confident that excluding it leads to a more accurate identification of misse...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.