Evaluating Enhanced LLMs for Precise Mental Health Diagnosis from Clinical Notes
Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Anxiety, depression, and other mental health conditions are affecting millions of people worldwide each year. However, limited access to mental health professionals and the stigma surrounding mental illness often deter individuals from seeking help. Many areas, especially rural and underserved communities, face a significant shortage of mental health professionals, making it difficult for individuals to access timely support and treatment. Traditional therapy can be expensive, time-consuming, and intimidating, discouraging individuals from seeking care and delaying essential treatment. The goal of this project is to harness the power of large language models
In the medical domain, large language models (LLMs) have the potential to significantly enhance clinical practice by assisting with tasks such as diagnostic support, therapeutic interventions, and summarization. However, these models often generate inaccurate responses, or “hallucinations,” when faced with queries they cannot effectively handle, raising concerns in the medical community. To address the limitations of LLMs, Retrieval-Augmented Generation (RAG) was leveraged to enhance their performance. By integrating external knowledge sources such as ICD-10-CM guidelines and psychiatric diagnostic manuals, RAG enables LLMs to retrieve relevant information in real time to support their predictions. This study examines whether LLMs can understand and accurately predict mental health-related medical codes from clinical notes. These codes are crucial for clinical documentation and treatment planning. We tested several LLMs (e.g., GPT, LLaMA, Gemini-Pro) enhanced with reliable resources like ICD-10-CM guidelines to evaluate their ability to identify and understand mental health terms and ICD-10-CM codes in psychiatric clinical notes. Our findings reveal that current models lack a robust understanding of the meaning and nuances of these codes, limiting their reliability for mental health applications. This underscores the need for improved strategies to represent and integrate these complex alphanumeric codes within LLMs. Enhancing their capability to accurately process mental health terminologies would make LLMs more reliable and trustworthy tools for mental health professionals, ultimately supporting better care and outcomes for patients.