Enhanced Language Models for Predicting and Understanding HIV Care Disengagement: A Case Study in Tanzania
Abstract
Sustained engagement in HIV care and adherence to antiretroviral therapy (ART) are crucial for meeting the UNAIDS "95-95-95" targets. Disengagement from care remains a significant problem, especially in sub-Saharan Africa. Traditional machine learning (ML) models have had moderate success in predicting disengagement, which enables early intervention. We developed an enhanced large language model (LLM), fine-tuned on electronic medical records (EMRs), to identify individuals at risk of disengaging from HIV care in Tanzania. Using 4.8 million EMR records from the National HIV Care and Treatment Program (2018–2023), we identified risks of ART non-adherence, non-suppressed viral load, and loss to follow-up. Our enhanced LLM may outperform traditional ML models and zero-shot LLMs. HIV physicians in Tanzania evaluated the model’s predictions and justifications: 65% aligned with expert assessments, and 92.3% of the aligned cases were considered clinically relevant. This model can support data-driven decisions and may improve patient outcomes and reduce HIV transmission.