A prediction model based on machine learning for diagnosing the early COVID-19 patients
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
With the dramatically fast spread of COVID-19, the real-time reverse transcription polymerase chain reaction (RT-PCR) test has become the gold standard method for confirming COVID-19 infection. However, RT-PCR tests are complicated to perform, and it usually takes 5-6 hours or even longer to get a result. Additionally, due to the low viral loads in early COVID-19 patients, RT-PCR tests give false negative results in a number of cases. Analyzing complex medical datasets with machine learning gives health care workers excellent opportunities to develop a simple and efficient COVID-19 diagnostic system. This paper aims to extract risk factors from the clinical data of patients with early COVID-19 infection and to apply four traditional machine learning approaches, namely logistic regression (LR), support vector machine (SVM), decision tree (DT) and random forest (RF), together with a deep learning-based method, to the diagnosis of early COVID-19. The results show that the LR predictive model presents a higher specificity of 0.95, an area under the receiver operating characteristic curve (AUC) of 0.971 and an improved sensitivity of 0.82, which makes it optimal for screening for early COVID-19 infection. We also verify the generality of the best model (the LR predictive model) in the Zhejiang population and analyze the contribution of the factors to the predictive models. Our manuscript describes and highlights the ability of machine learning methods to improve the accuracy and timeliness of early COVID-19 diagnosis. The higher AUC of our LR-based predictive model makes it a more suitable method for assisting COVID-19 diagnosis. The optimal model has been packaged as a mobile application (APP) and deployed in some hospitals in Zhejiang Province.
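For readers unfamiliar with the three metrics quoted in the abstract, the following is a minimal plain-Python sketch of how sensitivity, specificity and AUC can be computed from true labels and model scores. The labels, scores, the 0.5 decision threshold and all function names are illustrative assumptions; none of them come from the manuscript, and this is not the authors' evaluation code.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = infected)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Fraction of truly positive cases that the model flags as positive."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of truly negative cases that the model flags as negative."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 infected and 4 non-infected patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]  # assumed model outputs
threshold = 0.5                                     # assumed decision cutoff
y_pred = [1 if s >= threshold else 0 for s in scores]

print(sensitivity(y_true, y_pred))  # 0.75
print(specificity(y_true, y_pred))  # 0.75
print(auc(y_true, scores))          # 0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.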
Article activity feed
-
SciScore for 10.1101/2020.06.03.20120881:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: We then split processed patients into training(80%) and validation(20%) partitions randomly to train our models.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
No key resources detected.
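The random 80%/20% training and validation split quoted under Randomization in Table 1 could be sketched as follows. This is only an illustration using Python's standard library: the function name, the fixed seed and the integer patient IDs are assumptions, and the manuscript does not describe its splitting procedure beyond the quoted sentence.

```python
import random


def train_validation_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into training and validation parts.

    A fixed seed (an assumption here) makes the split reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical patient identifiers standing in for processed patient records.
patients = list(range(100))
train, validation = train_validation_split(patients)

print(len(train), len(validation))  # 80 20
```

Reporting the seed or the exact partition alongside such a split is one way to make the randomization step auditable.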
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Nevertheless, this study still has several limitations. First of all, the recruited participants are limited to Zhejiang Province, which causes certain regional restrictions in the application of the predictive models. Further extremely concerning about the epidemiological characteristics and nationwide studies are needed to access the generality of the suggested model. Secondly, there is a lack of information on the progression and prognosis of COVID-19 as well as asymptomatic infection cases. Finally, more information of infections should be recruited to improve the accurate of screening model.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a protocol registration statement.