Evaluating the performance of an artificial intelligence-based electronic reader for malaria rapid diagnostic tests across four sub-Saharan African countries

Abstract

Background The introduction of malaria rapid diagnostic tests (RDTs) has expanded parasitologic confirmation of malaria at all levels of health systems in sub-Saharan Africa (SSA), improving case management and surveillance. However, concerns persist about healthcare worker adherence to results and the accuracy of results recorded in health facility registers. Electronic RDT readers have been proposed to improve the consistency of diagnosis and reporting, though their performance relative to expert human interpretation varies. We assessed the performance of the HealthPulse (Audere, Seattle, WA, USA) smartphone application, an artificial intelligence (AI)-based RDT reader, across four countries in SSA.

Methods In 2023, the Malaria Rapid Diagnostic Test Capture and Reporting Assessment (MaCRA) was implemented in health facilities in Benin, Côte d’Ivoire, Nigeria, and Uganda. Study staff collected images of malaria RDTs using the HealthPulse app after healthcare workers performed and interpreted the tests. A trained panel of external reviewers interpreted the RDT images, serving as the reference standard. RDTs were classified as positive, negative, invalid, or indeterminate. We evaluated classification accuracy using recall, precision, and F1 scores (harmonic mean of recall and precision), and applied logistic regression to assess factors influencing AI performance, including country, RDT product, and the presence of faint lines and anomalies.

Results Out of 110,843 RDT images collected, 110,231 (99.4%) were included in the analysis. The AI algorithm demonstrated high overall accuracy (96.8%) and an F1 score of 96.6% compared to panel interpretations. Recall and precision were >96% for positive and negative outcomes but much lower for invalid (recall: 84.5%; precision: 42.9%) and indeterminate classifications (recall: 0.7%; precision: 2.3%). AI performance varied by country, RDT product, and presence of faint lines. When test lines were faint, the odds of the AI algorithm correctly recalling both positive results (adjusted odds ratio [OR] 0.01; 95% CI 0.00, 0.01) and negative results (adjusted OR 0.20; 95% CI 0.11, 0.35) were reduced.

Conclusions The HealthPulse AI algorithm demonstrated high agreement with a trained panel in interpreting malaria RDT images across diverse settings. However, reduced performance for invalid and indeterminate results and varying performance by country and RDT product highlight the need for further refinement. The HealthPulse app shows potential as a supportive tool in research and training.
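The per-class metrics named in the Methods (recall, precision, and F1 as their harmonic mean) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `per_class_metrics` and `accuracy`, the `CLASSES` tuple, and the six toy labels are hypothetical and are not study data or the authors' analysis code. The sketch also makes no claim about how the study aggregated per-class scores into its overall F1.

```python
from collections import Counter

# The four result categories described in the abstract.
CLASSES = ("positive", "negative", "invalid", "indeterminate")


def per_class_metrics(reference, predicted, classes=CLASSES):
    """Return {class: (recall, precision, f1)} for paired label sequences.

    `reference` plays the role of the panel interpretation and
    `predicted` the role of the AI reader (hypothetical names).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")

    # Count every (reference label, predicted label) pair once.
    pairs = Counter(zip(reference, predicted))
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        fn = sum(n for (ref, pred), n in pairs.items() if ref == c and pred != c)
        fp = sum(n for (ref, pred), n in pairs.items() if ref != c and pred == c)
        # Guard against empty denominators for classes that never occur.
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = recall + precision
        f1 = 2 * recall * precision / denom if denom else 0.0
        metrics[c] = (recall, precision, f1)
    return metrics


def accuracy(reference, predicted):
    """Share of items where the two interpretations agree."""
    agree = sum(r == p for r, p in zip(reference, predicted))
    return agree / len(reference)


# Toy labels for illustration only (not study data).
panel = ["positive", "positive", "negative", "negative", "invalid", "positive"]
reader = ["positive", "negative", "negative", "negative", "invalid", "positive"]

for name, (rec, prec, f1) in per_class_metrics(panel, reader).items():
    print(f"{name:14s} recall={rec:.2f} precision={prec:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(panel, reader):.2f}")
```

In this toy case the reader misses one panel-positive test, so positive recall falls while positive precision stays at 1.0. A rare class with many false alarms shows the opposite pattern (high recall with low precision), which is the pattern reported above for invalid results.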
