Domain Shifts in Machine Learning Based Covid-19 Diagnosis From Blood Tests

Abstract

Many previous studies claim to have developed machine learning models that diagnose COVID-19 from blood tests. However, we hypothesize that changes in the underlying distribution of the data, so-called domain shifts, affect predictive performance and reliability and are one reason such machine learning models fail in clinical application. Domain shifts can be caused, e.g., by changes in disease prevalence (spread of the disease or composition of the tested population), by refined RT-PCR testing procedures (way of taking samples, laboratory procedures), or by virus mutations. Therefore, machine learning models for diagnosing COVID-19 or other diseases may not be reliable and may degrade in performance over time. We investigate whether domain shifts are present in COVID-19 datasets and how they affect machine learning methods. We further set out to estimate the mortality risk based on routinely acquired blood tests in a hospital setting throughout pandemics and under domain shifts. We reveal domain shifts by evaluating the models on a large-scale dataset with different assessment strategies, such as temporal validation. We present the novel finding that domain shifts strongly affect machine learning models for COVID-19 diagnosis and deteriorate their predictive performance and credibility. Therefore, frequent re-training and re-assessment are indispensable for robust models enabling clinical utility.
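The temporal validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, and the helper names (`make_drifting_data`, `temporal_split`, `evaluate`), the 70/30 split, and the random forest settings are assumptions chosen for illustration only. The sketch compares a random train/test split, which mixes early and late samples, with a temporal split that trains on the earliest samples and tests on the latest.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def make_drifting_data(n=2000, seed=0):
    """Synthetic stand-in for blood-test data (hypothetical, not the study data).

    The feature that drives the label changes gradually over time,
    which mimics a domain shift.
    """
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 1.0, size=n))  # sample time, 0 = earliest
    X = rng.normal(size=(n, 5))
    # Early on the label depends mostly on feature 0, later mostly on feature 1.
    logits = 3.0 * ((1.0 - t) * X[:, 0] + t * X[:, 1])
    prob = 1.0 / (1.0 + np.exp(-logits))
    y = (rng.uniform(size=n) < prob).astype(int)
    return X, y, t


def temporal_split(t, train_fraction=0.7):
    """Return boolean masks: earliest samples for training, latest for testing."""
    cutoff = np.quantile(t, train_fraction)
    train = t <= cutoff
    return train, ~train


def evaluate(X, y, t, seed=0):
    """Return (ROC AUC with a random split, ROC AUC with a temporal split)."""
    # Random split: training and test samples come from the same time range.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_tr, y_tr)
    auc_random = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

    # Temporal split: the model never sees data from the test period.
    train, test = temporal_split(t)
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X[train], y[train])
    auc_temporal = roc_auc_score(y[test], model.predict_proba(X[test])[:, 1])
    return auc_random, auc_temporal


if __name__ == "__main__":
    X, y, t = make_drifting_data()
    auc_random, auc_temporal = evaluate(X, y, t)
    print(f"random split AUC:   {auc_random:.3f}")
    print(f"temporal split AUC: {auc_temporal:.3f}")
```

A gap between the two scores on real data would suggest that a random split overstates how well the model will perform on future patients, which is the kind of effect the abstract attributes to domain shifts.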

SciScore for 10.1101/2021.04.06.21254997:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
In particular, the model classes RF, KNN and SVM are trained with the scikit-learn package 0.22.1.	scikit-learn suggested: (scikit-learn, RRID:SCR_002577)
XGB is trained with the XGBClassifier from the Python package XGBoost 1.3.1.	Python suggested: (IPython, RRID:SCR_001658)

Results from OddPub: Thank you for sharing your code.

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

One limitation of our work could be that we did not evaluate the generalization of our model to other hospitals. A transfer of a COVID-19 diagnostic model should only be done with thorough re-assessments, as a domain shift between hospitals might be present. Among other causes, such domain shifts from one institution to another could result from different testing strategies, laboratory equipment or demographics of the population in the hospital catchment area. Re-training of models rather than transferring to another hospital should be considered to obtain a skilled and trustworthy model. However, this is not part of our investigation. Our findings and suggestions about domain shifts should be accounted for in all hospitals when applying a COVID-19 model. We evaluate our models on different cohorts to demonstrate their high performance as well as to reveal the domain shifts. However, the 2020 cohort only contains subjects who were tested for COVID-19 and from whom a blood test was taken. Hence, the 2020 cohort is only a subset of the total patient cohort on which the model will be applied. To counteract missing samples from a particular group, we also use the pre-pandemic negatives, which should cover a wide variety of negatives due to the large data set. An evaluation of all blood tests of 2020 is not possible due to the lack of RT-PCR tests, which serve as labels in our ML approach. Non-tested subjects of 2020 cannot be assumed to be negatives; therefore, we discard them. This could onl...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Related articles

Development and Deployment of a Machine Learning–Based Predictive Model for COVID-19 Infection Using Patient Demographic and Symptom Data in Nigeria

Machine Learning Analysis of COVID19 Transmission Dynamics Demographic Risk and Contact Tracing Outcomes in Nigeria

Machine Learning-Enabled Diagnosis of Viral Respiratory Infections from Exhaled Volatile Organic Compound Analysis