Improving the Robustness of Large Language Models in Extracting Social Determinants of Health
Abstract
Accurate extraction of Social Determinants of Health (SDOH) from text is crucial for various healthcare applications. While Large Language Models (LLMs) have shown promise in this domain, their generalization across different datasets remains a challenge. To address this, we propose Iterative Prompt Self-Correction (IPSC), a novel training strategy that enables an LLM to iteratively refine prompts for SDOH extraction through a self-supervised feedback mechanism. Our method uses an extraction LLM guided by a set of prompts and an evaluation LLM that assesses the quality of the extracted information. The feedback from the evaluation model is then used to automatically refine the prompts for the subsequent iteration. We evaluated IPSC on two distinct datasets, an SDOH corpus and the MIMIC-III clinical database, and compared its performance against several baseline methods, including standard fine-tuning and basic prompt tuning. Both quantitative results, measured by Precision, Recall, and F1-score, and qualitative human evaluations demonstrate that IPSC significantly outperforms the baselines, yielding more accurate and robust SDOH extraction. Our findings highlight the potential of self-supervised prompt optimization for improving the generalizability of LLMs in specialized information extraction tasks.
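The extract-evaluate-refine loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the two LLM calls are replaced by toy keyword matching so the control flow is runnable on its own.

```python
SDOH_CUES = ("homeless", "unemployed", "smoker")  # toy stand-ins for SDOH categories


def extract_sdoh(prompt: str, note: str) -> list[str]:
    """Stand-in for the extraction LLM: returns cues the prompt asks for
    that also appear in the clinical note (keyword matching, not a real LLM)."""
    return [cue for cue in SDOH_CUES if cue in prompt and cue in note.lower()]


def evaluate_extraction(extracted: list[str], note: str) -> tuple[float, list[str]]:
    """Stand-in for the evaluation LLM: scores the extraction and reports
    cues present in the note that the extractor missed."""
    missed = [cue for cue in SDOH_CUES
              if cue in note.lower() and cue not in extracted]
    score = 1.0 - len(missed) / len(SDOH_CUES)
    return score, missed


def ipsc(note: str, prompt: str, max_iters: int = 3) -> tuple[str, list[str]]:
    """Iterative Prompt Self-Correction loop (sketch): extract, evaluate,
    then fold the evaluator's feedback back into the prompt and retry."""
    extracted: list[str] = []
    for _ in range(max_iters):
        extracted = extract_sdoh(prompt, note)
        _, missed = evaluate_extraction(extracted, note)
        if not missed:  # evaluator is satisfied: stop refining early
            break
        prompt += " Also look for: " + ", ".join(missed)  # prompt refinement step
    return prompt, extracted
```

In the paper's setting, `extract_sdoh` and `evaluate_extraction` would each be a call to an LLM, and the refinement step would rewrite the prompt from the evaluator's textual feedback rather than appending keywords; only the loop structure is meant to carry over.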