Large Language Models for Mining Biobank-Derived Insights into Health and Disease

Abstract

Large Language Models (LLMs) offer transformative potential for analysing biobank-derived datasets, facilitating knowledge extraction, patient stratification, and predictive modelling. This study benchmarks multiple LLMs on retrieving biomedical insights from a leading biobank, the UK Biobank. UK Biobank-related literature is used as the gold standard for assessing the coverage and retrieval performance of several of the best-known LLMs, including GPT, Claude, Gemini, Mistral, Llama and DeepSeek. The findings highlight each model’s strengths and limitations and emphasise challenges in data heterogeneity and accessibility. We suggest that future research should take advantage of LLMs to improve the precision of biobank knowledge extraction.
