Large Language Models for Mining Biobank-Derived Insights into Health and Disease
Abstract
Large Language Models (LLMs) offer transformative potential for analysing biobank-derived datasets, facilitating knowledge extraction, patient stratification, and predictive modelling. This study benchmarks multiple LLMs on retrieving biomedical insights from a leading biobank, the UK Biobank. UK Biobank-related literature is used as the gold standard for assessing the coverage and retrieval performance of several of the best-known LLMs, including GPT, Claude, Gemini, Mistral, Llama and DeepSeek. The findings highlight each model’s strengths and limitations, emphasising challenges in data heterogeneity and accessibility. We suggest that future research should take advantage of LLMs to improve the precision of biobank knowledge extraction.