A conversational agent for providing personalized PrEP support - Protocol for chatbot implementation and evaluation
Abstract
Background
Lack of awareness, misconceptions, and stigma hinder pre-exposure prophylaxis (PrEP) adoption and self-efficacy, contributing to low PrEP uptake and ongoing HIV incidence. Chatbots have the potential to reduce barriers to HIV treatment and PrEP by providing anonymous and continuous support. However, chatbots remain nascent in the context of HIV and PrEP, lacking personalized information, experiential expertise, and human-like emotional support. Personalized information can increase engagement by tailoring content to individual needs; peer experiences can influence health decisions by providing relatable perspectives; and effective emotional support can foster resilience and individual well-being. To promote PrEP uptake, we developed a retrieval-augmented generation (RAG) chatbot that provides personalized information, peer experiential expertise, and emotional support to PrEP candidates.
Objective
In this paper, we describe the iterative development of a RAG chatbot that provides personalized information, peer experiential expertise, and human-like emotional support to PrEP candidates.
Methods
We employ an iterative design process consisting of two phases: prototype conceptualization and iterative chatbot development. In the conceptualization phase, we identify real-world PrEP needs and design a functional dialog flow diagram for PrEP support. Chatbot development involves two components: a query preprocessor and a RAG module. The preprocessor uses the Segment Any Text (SAT) tool for query segmentation and a fine-tuned Gemma 2 support classifier to identify informational, emotional, and contextual content in real-world queries. The RAG module retrieves documents using information retrieval techniques: Sentence-BERT (SBERT) embeddings with cosine similarity capture semantic similarity, and topic matching identifies documents topically relevant to the query. Extensive prompt engineering guides the LLM (Gemini-2.0-Flash) in generating tailored responses. We conduct 10 rounds of internal evaluation to assess chatbot responses against 10 criteria: clarity, accuracy, actionability, relevancy, information detail, tailored information, comprehensiveness, language suitability, tone, and empathy. This feedback is used to iteratively refine the LLM prompts and enhance the quality of chatbot responses.
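The retrieval step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `embed` function is a hashed bag-of-words placeholder standing in for SBERT embeddings (in practice these would come from a library such as sentence-transformers), and the example documents and topics are hypothetical.

```python
import numpy as np

def embed(texts):
    # Placeholder for SBERT embeddings: deterministic (per-run) hashed
    # bag-of-words vectors, used here only to make the sketch runnable.
    dim = 64
    vecs = []
    for t in texts:
        v = np.zeros(dim)
        for tok in t.lower().split():
            v[hash(tok) % dim] += 1.0
        vecs.append(v)
    return np.array(vecs)

def cosine_sim(a, b):
    # Row-wise cosine similarity between two embedding matrices.
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + 1e-9)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + 1e-9)
    return a @ b.T

def retrieve(query, docs, doc_topics, query_topic, k=2):
    # Topic matching: restrict candidates to documents sharing the query's
    # topic; fall back to all documents if no topic match exists.
    idx = [i for i, t in enumerate(doc_topics) if t == query_topic]
    idx = idx or list(range(len(docs)))
    # Rank the candidates by semantic similarity to the query.
    sims = cosine_sim(embed([query]), embed([docs[i] for i in idx]))[0]
    ranked = sorted(zip(idx, sims), key=lambda p: -p[1])[:k]
    return [docs[i] for i, _ in ranked]

docs = [
    "PrEP is highly effective when taken daily.",
    "Common side effects include mild nausea at first.",
    "Insurance and assistance programs can cover PrEP costs.",
]
topics = ["effectiveness", "side_effects", "cost"]
print(retrieve("Does PrEP have side effects?", docs, topics, "side_effects", k=1))
```

The retrieved documents would then be passed as context to the LLM prompt for response generation.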
Results
We developed a RAG chatbot and iteratively refined it based on internal evaluation feedback. Prompt engineering is essential for guiding the LLM to generate tailored responses. Through chatbot development, we found that prompt effectiveness varies with task complexity because LLMs are sensitive to prompt structure and linguistic variability. For tasks requiring diverse perspectives, fine-tuning LLMs on annotated datasets provides better results than few-shot prompting. Prompt decomposition (segmenting prompt instructions into discrete steps) improves comprehensiveness and relevancy for complex, long queries. For tasks with high decision variance, condensed prompts (summarizing the main concept or idea) reduce ambiguity in LLM decisions more effectively than decomposed prompts.
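To make the distinction between the two prompting strategies concrete, the sketch below contrasts a decomposed prompt with a condensed one. The function names and instruction wording are hypothetical illustrations, not the prompts used in the study.

```python
def decomposed_prompt(query, documents):
    # Prompt decomposition: instructions are segmented into numbered steps,
    # which helped comprehensiveness and relevancy on long, complex queries.
    steps = [
        "1. Answer each part of the user's question using ONLY the context below.",
        "2. Include one relevant peer experience from the context, if present.",
        "3. Close with a brief, empathetic acknowledgment of the user's concerns.",
    ]
    context = "\n".join(documents)
    return "\n".join(steps) + f"\n\nContext:\n{context}\n\nQuestion: {query}"

def condensed_prompt(query, documents):
    # Condensed prompt: one summarized instruction, which reduced ambiguity
    # for tasks with high decision variance.
    context = "\n".join(documents)
    return (
        "Using only the context, give a clear, accurate, and empathetic "
        f"answer to the question.\n\nContext:\n{context}\n\nQuestion: {query}"
    )
```

Either prompt would be sent to the LLM along with the retrieved documents; the choice between them depends on the task characteristics described above.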
Conclusion
Chatbots have the potential to promote PrEP uptake. By leveraging social media data and RAG techniques, our chatbot can provide personalized information, peer experiences, and human-like emotional support, elements essential for reducing PrEP misconceptions and promoting self-efficacy. Expert and user feedback will help validate and further improve the chatbot.