A Deep Learning–Based Imaging Informatics Framework for Automated Detection of Plasmodium falciparum in Blood Smear Microscopy

Abstract

Background: Malaria remains a major global health burden, with over 249 million cases reported worldwide in 2022. Light microscopy of peripheral blood smears is the diagnostic gold standard but is labor-intensive, operator-dependent, and prone to variability, particularly in resource-limited settings. Imaging informatics and deep learning offer the potential to automate and standardize malaria screening workflows.

Objective: To develop and validate a high-sensitivity convolutional neural network (CNN)–based imaging informatics model for automated classification of segmented Plasmodium falciparum–infected erythrocytes, and to benchmark its diagnostic performance against a traditional Random Forest classifier trained on the full high-dimensional pixel feature space.

Methods: A total of 27,558 segmented erythrocyte images from the NIH Malaria Dataset were used. Images underwent preprocessing and augmentation before training of a sequential CNN comprising three convolutional layers, optimized with the Adam optimizer. For comparison, a Random Forest classifier was trained on the full pixel-level feature space without spatial feature extraction. Model performance was evaluated on an independent test set (n = 5,511) using accuracy, sensitivity, specificity, negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC).

Results: The Random Forest classifier performed at near-chance levels on the full pixel feature space, achieving an accuracy of 49.55% and an AUC of 0.493. In contrast, the CNN achieved an accuracy of 95.50% (95% CI: 94.9–96.1), a 45.95-percentage-point absolute improvement. The CNN demonstrated high sensitivity (96.12%), high NPV (96.07%), and excellent discriminative ability (AUC = 0.986).

Conclusion: Deep learning–based imaging informatics substantially outperforms traditional pixel-based machine learning approaches for malaria microscopy classification. The failure of the Random Forest model underscores the necessity of spatial feature extraction in high-dimensional image data. The high sensitivity and NPV of the proposed CNN support its potential role as an automated first-pass screening tool to augment microscopy-based malaria diagnosis, particularly in high-burden, resource-constrained settings.
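The Random Forest baseline described in the Methods treats each segmented erythrocyte image as a flat pixel vector, which discards all spatial structure. A minimal sketch of that setup follows; the image dimensions, random stand-in data, and hyperparameters are illustrative assumptions, not the study's actual configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for segmented erythrocyte images (the study used the NIH
# Malaria Dataset); shapes here are assumptions for illustration.
X = rng.random((200, 32, 32, 3))        # 200 RGB cell crops, 32x32 px
y = rng.integers(0, 2, 200)             # 0 = uninfected, 1 = parasitized

# Flattening turns each image into a 3,072-dimensional pixel vector,
# losing the 2-D neighborhood relationships a CNN would exploit.
X_flat = X.reshape(len(X), -1)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_flat, y)
```

Because the trees see only independent pixel intensities, the classifier cannot learn the morphological cues (parasite shape, chromatin dots) that distinguish infected cells, which is consistent with the near-chance performance the study reports.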
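The metrics reported in the Results (accuracy, sensitivity, specificity, NPV) all derive from a 2x2 confusion matrix. A short helper makes the definitions explicit; the example counts are made up for demonstration and are not the study's numbers:

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return accuracy, sensitivity, specificity, npv

# Illustrative counts only (hypothetical, not from the paper):
acc, sens, spec, npv = diagnostic_metrics(tp=93, fp=4, tn=96, fn=7)
```

A high NPV, as reported for the CNN, is the property that matters most for a first-pass screening tool: a negative prediction can be trusted to rule out infection, so human microscopists need only review the positives.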
