An ML prediction model based on clinical parameters and automated CT scan features for COVID-19 patients

Abstract

Outcome prediction for individual patient groups is of paramount importance for selecting appropriate therapeutic options, communicating risk to patients and families, and allocating resources through optimal triage. This has become even more necessary in the context of the current COVID-19 pandemic. Widening the spectrum of predictor variables by including radiological parameters alongside the commonly used demographic, clinical and biochemical ones can facilitate building a comprehensive prediction model. Automation makes it possible to build such models for time-critical environments, so that a clinician can use the model outputs in real-time decision making at the bedside. We show that combining computed tomography (CT) data with clinical parameters (CP) in a machine learning model built from 302 COVID-19 patients presenting to an acute care hospital in India could prognosticate the need for invasive mechanical ventilation. Models developed from CP alone, from CP plus a radiologist-derived CT severity score, and from CP plus an automated lesion-to-lung ratio had AUCs of 0.87 (95% CI 0.85–0.88), 0.89 (95% CI 0.87–0.91), and 0.91 (95% CI 0.89–0.93), respectively. We show that an operating point on the ROC curve can be chosen to aid clinicians in risk characterization according to resource availability and ethical considerations. This approach can be deployed in more general settings, with appropriate calibration, to predict outcomes of severe COVID-19 patients effectively.
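The idea of choosing an operating point on the ROC curve can be illustrated with a minimal sketch. This is not the authors' code: the function names (`roc_points`, `auc`, `pick_operating_point`), the eight made-up patients, and the selection rule (highest specificity among thresholds that reach a required sensitivity) are all assumptions for illustration. It is written in plain Python so that it is self-contained; the study itself reports using a random forest from scikit-learn.

```python
from typing import List, Sequence, Tuple

# (threshold, sensitivity, specificity)
RocPoint = Tuple[float, float, float]


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[RocPoint]:
    """Return one ROC point per distinct score, used as a threshold.

    A patient is flagged as high risk when score >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= thr)
        points.append((thr, tp / n_pos, 1 - fp / n_neg))
    return points


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pick_operating_point(points: Sequence[RocPoint], min_sensitivity: float) -> RocPoint:
    """Hypothetical rule: among thresholds that reach the required
    sensitivity, keep the one with the highest specificity."""
    eligible = [p for p in points if p[1] >= min_sensitivity]
    if not eligible:
        raise ValueError("no threshold reaches the requested sensitivity")
    return max(eligible, key=lambda p: p[2])


# Made-up example: 1 = needed invasive mechanical ventilation, 0 = did not.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]  # illustrative model outputs

points = roc_points(y_true, scores)
model_auc = auc(y_true, scores)

# A unit with spare ventilator capacity might demand that no case is missed;
# a constrained unit might accept lower sensitivity for fewer false alarms.
cautious = pick_operating_point(points, min_sensitivity=1.0)
constrained = pick_operating_point(points, min_sensitivity=0.75)

print(f"AUC = {model_auc:.4f}")
print("cautious (threshold, sensitivity, specificity):", cautious)
print("constrained (threshold, sensitivity, specificity):", constrained)
```

On this toy data the stricter sensitivity requirement pushes the threshold down and costs specificity, which is the trade-off the abstract refers to when it ties the operating point to resource availability and ethical considerations.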

Article activity feed

  1. SciScore for 10.1101/2022.01.30.22269998:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Implementation Notes: Computer coding was carried out in Python programming language version 3.8 [28], and compute resources from Google Colab [29] were used.
    Resource: Python
    Suggested: (IPython, RRID:SCR_001658)

    Sentence: Deep learning models were implemented using TensorFlow v2 [30] library.
    Resource: TensorFlow
    Suggested: (tensorflow, RRID:SCR_016345)

    Sentence: OpenCV v4 [31] and scikit-image v0.17 [32] were used for image processing.
    Resource: OpenCV
    Suggested: (OpenCV, RRID:SCR_015526)

    Sentence: We used random forest model from the scikit-learn v0.24 [33] library.
    Resource: scikit-learn
    Suggested: (scikit-learn, RRID:SCR_002577)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    There are several limitations of the study. First, important physiological parameters such as respiratory rate and SpO2 on air at presentation were not collected due to the retrospective nature of the study. These should be added in subsequent iterations to make the prediction more robust, and will most likely make the model significantly better. Second, data collected from a single centre might bias the results, owing to local clinician practice in selecting patients for mechanical ventilation. Third, the dataset was developed during the first wave of COVID-19 in India. The changing nature and virulence of the virus may alter the performance of the model, and recalibration may be required in successive waves. Fourth, the patients were not on therapy at the time of acquisition of the CT scan. Therefore, it is not clear whether the model can be applied to patients already admitted to the hospital who have been given proven therapy (e.g. systemic corticosteroids or IL-6 inhibitors). Fifth, it should be noted that specific lung pathology cannot be differentiated through this method, and therefore a radiologist should still view the images from the standpoint of traditional reporting. Finally, a strategy needs to be devised for scans done in the HRCT format, in order to extrapolate the model developed on volume CT scans to high-resolution scans, which are done with fewer sections. Furthermore, there is no detailed demographic distribution data regarding t...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.