An interpretable machine learning model for assessing the risk of Talaromycosis in HIV patients lacking skin lesions

Abstract

Introduction: Existing predictive models for talaromycosis in HIV-infected patients without skin lesions are limited by their reliance on established risk factors and traditional statistical approaches. This study aimed to develop an interpretable machine learning (ML) model for predicting the risk of talaromycosis in HIV patients without skin lesions and to validate its clinical applicability.
Methods: This retrospective multicenter study analyzed electronic medical records (EMR) from four tertiary hospitals in China, covering the period from 2010 to 2019. The training dataset comprised 1,009 HIV patients with opportunistic infections, and external validation used data from 305 patients at an independent center. From an initial set of 36 variables, twelve key features were selected: albumin, absolute lymphocyte count, hemoglobin, alanine aminotransferase (ALT), aspartate aminotransferase (AST), AST/ALT ratio, C-reactive protein, white blood cell count, platelet count, peripheral or abdominal lymphadenopathy, CD4+ T-cell count, and age. Five ML algorithms were evaluated using 10-fold cross-validation. Model performance was measured using the area under the receiver operating characteristic curve (AUC), accuracy (ACC), and F1-score. Calibration curves and decision curve analysis (DCA) were used to assess the model's reliability and clinical net benefit. The optimal model was subsequently implemented as a web-based tool.
Results: The Support Vector Machine (SVM) outperformed the other models, achieving an AUC of 0.809 (95% CI: 0.778–0.838), an ACC of 0.714, and an F1-score of 0.689. Performance was higher in external validation, with an AUC of 0.921 (95% CI: 0.889–0.951), an ACC of 0.853, and an F1-score of 0.819. DCA indicated a significant net clinical benefit across various risk thresholds, and calibration curves showed strong concordance between predicted and observed risks.
Conclusion: This interpretable SVM model effectively stratifies the risk of talaromycosis in HIV patients in endemic regions, aligning with WHO recommendations for targeted prophylaxis. Its integration into a web-based tool improves clinical accessibility for early intervention in resource-constrained settings. Ongoing prospective trials (ChiCTR1900021195) are expected to further substantiate its real-world impact.
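
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal, illustrative sketch of how AUC, ACC, F1-score, and the net benefit used in decision curve analysis can be computed from predicted risks. It is not the authors' code. The function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration, and the net-benefit expression is the standard one (true positives per patient minus false positives per patient, weighted by the odds of the risk threshold).

```python
from typing import Sequence, Tuple


def auc_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random case is ranked above a random non-case."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when predicting positive for risk >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    tp, fp, _, fn = confusion_counts(y_true, y_prob, threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit at one risk threshold, as plotted in a decision curve."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Toy data (hypothetical): 1 = talaromycosis, 0 = other opportunistic infection.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_risks = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

if __name__ == "__main__":
    print("AUC:", auc_score(toy_labels, toy_risks))
    print("ACC:", accuracy(toy_labels, toy_risks))
    print("F1:", f1_score(toy_labels, toy_risks))
    for pt in (0.2, 0.5):
        print("Net benefit at threshold", pt, ":", net_benefit(toy_labels, toy_risks, pt))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. The study's own threshold range and confidence-interval method are not stated in the abstract and are not reproduced here.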
