Machine Learning for Differentiating Dengue from Chikungunya in Northern Brazil

Abstract

Purpose: Dengue and chikungunya, viral diseases spread by Aedes mosquitoes, are prevalent in northern Brazil, where overlapping symptoms hinder accurate diagnosis. This study aims to develop machine learning models to differentiate these diseases, enhancing early management and reducing underreporting in resource-limited settings. Methods: We used clinical symptom data from the Brazilian Notifiable Diseases Information System (SINAN, 2021–2023) to train machine learning models. The dataset comprised 4,874 PCR-confirmed cases among adults (18–59 years), split into training (2021–2022, n=2,437) and testing (2023, n=2,437) sets. Five algorithms (Random Forest, XGBoost, LightGBM, CatBoost, and TabPFN 2) were evaluated using AUC-ROC, precision, and recall metrics. Feature importance was analyzed with SHAP and Boruta methods. Results: The Random Forest model performed best, achieving an AUC-ROC of 0.782, precision of 0.734, and recall of 0.733 for dengue. Adjusting the classification threshold to the training prevalence (62.4%) optimized performance, supporting early triage in primary care. Conclusion: Machine learning enhances the sensitivity and efficiency of dengue-chikungunya diagnosis. By leveraging clinical symptoms, these models provide a practical, cost-effective tool for resource-constrained settings, improving arbovirus management.
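The evaluation step described in the abstract (setting the decision threshold to the training-set prevalence, then reporting precision, recall, and AUC-ROC on a later test period) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy labels and probabilities, and the choice of dengue as the positive class (label 1) are all assumptions made for illustration. The predicted probabilities would come from a fitted classifier, which is not shown here.

```python
from __future__ import annotations

from typing import Sequence


def prevalence(labels: Sequence[int]) -> float:
    """Fraction of positive cases (assumed here: dengue = 1) in a label set."""
    return sum(labels) / len(labels)


def classify(probs: Sequence[float], threshold: float) -> list[int]:
    """Label a case positive when its predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[float, float]:
    """Precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc_roc(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """AUC-ROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for t, p in zip(y_true, probs) if t == 1]
    neg = [p for t, p in zip(y_true, probs) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from SINAN or from the study.
train_labels = [1, 1, 1, 0, 0, 1, 0, 1]            # earlier period (training)
test_labels = [1, 0, 1, 1, 0, 0]                   # later period (testing)
test_probs = [0.90, 0.55, 0.70, 0.60, 0.30, 0.65]  # model output P(dengue)

# Use the training prevalence, rather than the default 0.5, as the cut-off.
threshold = prevalence(train_labels)
test_pred = classify(test_probs, threshold)
precision, recall = precision_recall(test_labels, test_pred)
auc = auc_roc(test_labels, test_probs)
```

Note that AUC-ROC depends only on how the probabilities rank the cases, so changing the threshold moves precision and recall but leaves the AUC unchanged.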
