One Step Closer to Conversational Medical Records: ChatGPT Parses Psoriasis Treatments from EMRs

Abstract

Large Language Models (LLMs), such as ChatGPT, are increasingly used in clinical settings, including documentation and decision support. However, their accuracy in extracting treatment data from unstructured dermatology records remains underexplored. We evaluated ChatGPT-4o’s ability to identify psoriasis treatments in free-text documentation against expert annotations. Ninety-four electronic medical records (EMRs) of psoriasis patients were retrospectively analyzed. ChatGPT-4o extracted treatments from each note, and its output was compared to annotations manually curated by dermatologists. Eighty-three treatments, spanning topical agents, systemic medications, biologics, phototherapy, and procedures, were evaluated. Performance metrics included recall, precision, F1-score, specificity, accuracy, Cohen’s Kappa, and AUC. ChatGPT-4o demonstrated strong performance, with a recall of 0.91, a precision of 0.96, an F1-score of 0.94, a specificity of 0.99, and an accuracy of 0.99. Agreement with expert annotations was high (Cohen’s Kappa = 0.93; AUC = 0.98). Group-level analysis confirmed these results, with the highest performance for biologics and methotrexate (F1 = 1.00) and lower recall in categories with vague documentation, such as systemic steroids and antihistamines. Our study highlights the potential of LLMs to extract psoriasis treatment information from unstructured clinical documentation and structure it for research and decision support. The model performed best on well-defined, commonly used treatments.
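
Conceptually, the evaluation treats each (record, treatment) pair as a binary detection task: a pair is positive if the treatment is documented in that note. The minimal sketch below shows how metrics of the kind reported here could be computed with scikit-learn over such flattened pairs; it is not the authors' code, and the arrays, sizes, and simulated agreement rate are illustrative assumptions only.

```python
# Minimal sketch (not the study's code): score per-(record, treatment) binary
# labels, 1 if a treatment is documented in a note and 0 otherwise.
# y_true holds expert annotations, y_pred the model's extractions; both are
# synthetic stand-ins here, not study data.
import numpy as np
from sklearn.metrics import (
    precision_score, recall_score, f1_score, accuracy_score,
    cohen_kappa_score, roc_auc_score, confusion_matrix,
)

rng = np.random.default_rng(0)
# Toy flattened vector; the paper's setting (94 records x 83 treatments)
# would yield 7,802 pairs.
y_true = rng.integers(0, 2, size=200)                           # expert labels
y_pred = np.where(rng.random(200) < 0.95, y_true, 1 - y_true)   # mostly agree

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"recall      = {recall_score(y_true, y_pred):.2f}")
print(f"precision   = {precision_score(y_true, y_pred):.2f}")
print(f"F1-score    = {f1_score(y_true, y_pred):.2f}")
print(f"specificity = {tn / (tn + fp):.2f}")   # not built into scikit-learn
print(f"accuracy    = {accuracy_score(y_true, y_pred):.2f}")
print(f"kappa       = {cohen_kappa_score(y_true, y_pred):.2f}")
print(f"AUC         = {roc_auc_score(y_true, y_pred):.2f}")  # binary outputs
```

With hard 0/1 predictions, as assumed here, the AUC reduces to a single operating point; reproducing the paper's per-category breakdown would simply mean repeating this scoring within each treatment group.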
