A Machine Learning Approach for Mortality Prediction in COVID-19 Pneumonia: Development and Evaluation of the Piacenza Score

Geza Halasz
Michela Sperti
Matteo Villani
Umberto Michelucci
Piergiuseppe Agostoni
Andrea Biagi
Luca Rossi
Andrea Botti
Chiara Mari
Marco Maccarini
Filippo Pura
Loris Roveda
Alessia Nardecchia
Emanuele Mottola
Massimo Nolli
Elisabetta Salvioni
Massimo Mapelli
Marco Agostino Deriu
Dario Piga
Massimo Piepoli

Abstract

Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated enough discriminatory capacity. Machine learning algorithms represent a novel approach for the data-driven prediction of clinical outcomes with advantages over statistical modeling.

Objective

We aimed to develop a machine learning–based score—the Piacenza score—for 30-day mortality prediction in patients with COVID-19 pneumonia.

Methods

The study comprised 852 patients with COVID-19 pneumonia admitted to the Guglielmo da Saliceto Hospital (Italy) from February to November 2020. Patients’ medical history, demographics, and clinical data were collected from the electronic health record. The overall patient data set was randomly split into derivation and test cohorts. The score was obtained with a naïve Bayes classifier and externally validated on 86 patients admitted to Centro Cardiologico Monzino (Italy) in February 2020. Using a forward-search algorithm, 6 features were identified: age, mean corpuscular hemoglobin concentration, PaO2/FiO2 ratio, temperature, previous stroke, and gender. The Brier score was used to evaluate the ability of the machine learning model to stratify and predict the observed outcomes. A user-friendly website was designed and developed to enable fast and easy use of the tool by physicians. To make the Piacenza score customizable, we added a tailored version of the algorithm to the website that computes the mortality risk for a patient when some of the variables used by the Piacenza score are not available: in this case, the naïve Bayes classifier is retrained on the same derivation cohort using a different set of patient characteristics. We also compared the Piacenza score with the 4C score and with a naïve Bayes algorithm with 14 features chosen a priori.
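The retraining step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which naïve Bayes variant was used, so the example assumes a Gaussian likelihood for continuous features only (binary features such as previous stroke or gender would need a different likelihood). The class name `GaussianNaiveBayes`, the feature keys, and the six-patient cohort are all hypothetical.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Tiny Gaussian naive Bayes for a binary outcome, trained on a chosen feature subset."""

    def __init__(self, features):
        self.features = list(features)

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.priors, self.stats = {}, {}
        for label, members in grouped.items():
            self.priors[label] = len(members) / len(rows)
            self.stats[label] = {}
            for name in self.features:
                values = [m[name] for m in members]
                mean = sum(values) / len(values)
                var = sum((v - mean) ** 2 for v in values) / len(values)
                # A small floor keeps the variance strictly positive.
                self.stats[label][name] = (mean, var + 1e-9)
        return self

    def predict_proba(self, row):
        """Return P(label = 1 | row), reading only the features this model was trained on."""
        log_post = {}
        for label, prior in self.priors.items():
            total = math.log(prior)
            for name in self.features:
                mean, var = self.stats[label][name]
                total += -0.5 * math.log(2 * math.pi * var)
                total -= (row[name] - mean) ** 2 / (2 * var)
            log_post[label] = total
        peak = max(log_post.values())  # subtract the maximum for numerical stability
        weights = {k: math.exp(v - peak) for k, v in log_post.items()}
        return weights.get(1, 0.0) / sum(weights.values())


# Made-up derivation cohort (illustration only): label 1 = died within 30 days.
rows = [
    {"age": 82, "pf_ratio": 150, "temperature": 38.5},
    {"age": 75, "pf_ratio": 180, "temperature": 38.0},
    {"age": 79, "pf_ratio": 120, "temperature": 37.2},
    {"age": 55, "pf_ratio": 320, "temperature": 37.0},
    {"age": 48, "pf_ratio": 350, "temperature": 37.6},
    {"age": 60, "pf_ratio": 290, "temperature": 36.8},
]
labels = [1, 1, 1, 0, 0, 0]

# Model using every available feature.
full = GaussianNaiveBayes(["age", "pf_ratio", "temperature"]).fit(rows, labels)

# PaO2/FiO2 is unavailable for the new patient, so retrain on the same cohort without it.
reduced = GaussianNaiveBayes(["age", "temperature"]).fit(rows, labels)
high_risk = reduced.predict_proba({"age": 80, "temperature": 38.2})
low_risk = reduced.predict_proba({"age": 50, "temperature": 37.0})
```

Retraining on the reduced feature set, rather than imputing the missing value, means the model never has to guess a measurement that was not taken; the trade-off is that each feature combination needs its own fitted model.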

Results

The Piacenza score exhibited an area under the receiver operating characteristic curve (AUC) of 0.78 (95% CI 0.74-0.84, Brier score=0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score=0.16) in the external validation cohort. Its accuracy was comparable to that of the 4C score and of the naïve Bayes model with a priori chosen features, which achieved AUCs of 0.78 (95% CI 0.73-0.83, Brier score=0.26) and 0.80 (95% CI 0.75-0.86, Brier score=0.17), respectively.
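For readers unfamiliar with the two metrics reported here, the following sketch shows how a Brier score and an AUC can be computed from predicted risks and observed outcomes. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `brier_score` and `roc_auc` and the five-patient data are made up, and the confidence intervals reported above are not reproduced.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 1/2)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = died within 30 days, 0 = survived.
outcomes = [1, 0, 1, 0, 0]
risks = [0.9, 0.2, 0.6, 0.7, 0.1]

brier = brier_score(outcomes, risks)
auc = roc_auc(outcomes, risks)
```

A lower Brier score is better. As a reference point, a model that always predicts a risk of 0.5 scores exactly 0.25, whatever the outcomes.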

Conclusions

Our findings demonstrated that a customizable machine learning–based score with a purely data-driven selection of features is feasible and effective for the prediction of mortality among patients with COVID-19 pneumonia.

Version published to 10.2196/29058
May 31, 2021
Version published to 10.2196/preprints.29058
Mar 29, 2021

SciScore for 10.1101/2021.03.16.21253752:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	Derivation and test cohorts: The available EHR of 852 patients was randomly split in derivation (70%) and test (30%) cohorts.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	Pregnant women, children (<18 years) and patients with negative RT-PCR assay were excluded from the study as well as patients presenting with shock and coma.

Table 2: Resources

Software and Algorithms
Sentences	Resources
The overall implementation of all codes for the machine learning score and analysis tools was performed in Python 3.7.4 environment.	Python suggested: (IPython, RRID:SCR_001658)

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Version published to 10.1101/2021.03.16.21253752 on medRxiv
Mar 20, 2021
