Semi-automated surveillance of surgical site infections: development of machine learning models and comparison with a rule-based classification model

Abstract

Background: Surveillance of surgical site infections (SSI) is essential for infection prevention, but traditional methods are labour-intensive. Semi-automated approaches, including the use of machine learning (ML), may improve efficiency. We aimed to develop and evaluate ML and rule-based classification models for the semi-automated surveillance of deep/organ-space SSI.

Methods: We analysed data from a prospective SSI surveillance cohort of adult patients (≥18 years) undergoing cardiac surgery, coronary artery bypass grafting, colorectal surgery, laminectomy, or spinal fusion. Surveillance was performed according to national guidelines using the CDC definition of SSI. We trained several ML models using data extracted from electronic health records. Model performance was evaluated using an 80/20 dataset split, with five-fold cross-validation within the training set. Sensitivity analyses assessed the effect of omitting structured and unstructured features.

Results: A total of 3,931 patients were included, with an overall deep/organ-space SSI rate of 4.5%. The best-performing ML models (Naïve Bayes and deep neural network) achieved sensitivity up to 90.0%, AUROC values up to 96.8%, and workload reductions exceeding 90%. The rule-based model achieved higher sensitivity (95.4%) but lower AUROC (85.9%) and workload reduction (70.0%). Omitting structured variables had little impact on ML performance, while excluding infection-related keywords significantly reduced sensitivity in the Naïve Bayes model.

Conclusions: Both ML and rule-based models can support efficient, semi-automated SSI surveillance. While ML offers superior AUROC and workload reduction, rule-based classification showed greater sensitivity, thus reducing the risk of missed infections.
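
The trade-off between sensitivity and workload reduction reported above can be illustrated with a minimal sketch. The abstract does not define workload reduction, so the code assumes it is the share of patients that the model does not flag for manual review. The function name `surveillance_metrics` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def surveillance_metrics(y_true: Sequence[int], flagged: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity and workload reduction for a screening model.

    y_true  : 1 if the patient had a confirmed SSI, else 0.
    flagged : 1 if the model flagged the patient for manual review, else 0.

    Assumption (not stated in the abstract): workload reduction is the
    proportion of patients that do NOT need manual review.
    """
    if len(y_true) != len(flagged):
        raise ValueError("y_true and flagged must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    # True positives: infections that the model flagged.
    tp = sum(1 for t, f in zip(y_true, flagged) if t == 1 and f == 1)
    # False negatives: infections that the model missed.
    fn = sum(1 for t, f in zip(y_true, flagged) if t == 1 and f == 0)

    # Sensitivity is undefined when there are no infections at all.
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    # Share of patients that skip manual review.
    workload_reduction = 1 - sum(flagged) / len(y_true)

    return {"sensitivity": sensitivity, "workload_reduction": workload_reduction}


# Toy data (hypothetical): 20 patients, 2 infections, 4 patients flagged.
y_true = [1, 1] + [0] * 18
flagged = [1, 1, 1, 1] + [0] * 16
metrics = surveillance_metrics(y_true, flagged)
print(metrics)
```

Under this assumed definition, flagging more patients tends to raise sensitivity while lowering workload reduction, which matches the direction of the difference reported between the rule-based and ML models.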
