Clinical Applications and Limitations of Large Language Models in Nephrology: A Systematic Review

Zoe Unger
Shelly Soffer
Orly Efros
Lili Chan
Eyal Klang
Girish N Nadkarni

Abstract

Background

Large Language Models (LLMs) are emerging as promising tools in healthcare. This systematic review examines LLMs’ potential applications in nephrology, highlighting their benefits and limitations.

Methods

We conducted a literature search in PubMed and Web of Science, selecting studies based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The review focuses on the latest advancements of LLMs in nephrology from 2020 to 2024. PROSPERO registration number: CRD42024550169.

Results

Fourteen studies met the inclusion criteria and were categorized into five key areas of nephrology: workflow streamlining, disease prediction and prognosis, laboratory data interpretation and management, renal dietary management, and patient education. LLMs performed well on several clinical tasks. In managing continuous renal replacement therapy (CRRT) alarms, a task aimed at reducing alarm fatigue in the intensive care unit (ICU), GPT-4 reached 90-94% accuracy. In predicting chronic kidney disease (CKD) progression, positive predictive value improved from 6.7% to 20.9%. In patient education, GPT-4 excelled at simplifying medical information, lowering its reading complexity, and at accurately translating kidney transplant resources. Gemini provided the most accurate responses to frequently asked questions (FAQs) about CKD.

Conclusions

While the incorporation of LLMs in nephrology shows promise across various levels of patient care, their broad implementation is still premature. Further research is required to validate their accuracy, their handling of rare and critical conditions, and their real-world performance.

Version published to 10.1101/2024.10.30.24316199 on medRxiv
Nov 1, 2024

Related articles

Evaluating Large Language Models for Translating Caries Guidelines into Clinical Decision Support

This article has 8 authors:
1. Gu Nan
2. Bingxin Fan
3. Yao Yuan
4. Xinliang Duan
5. Sichen Han
6. Zhenyong Tang
7. Jiayu Shen
8. Zilin Wang
Latest version: Jan 28, 2026

Benchmarking large language models for cardiovascular risk stratification using clinical vignettes

This article has 11 authors:
1. José Ferreira Santos
2. Regina Brito Duarte
3. Inês Mota
4. Rita Carvalheira Santos
5. José Maria Moreira
6. Joana Campos
7. Nuno André Silva
8. Bernardo Neves
9. Ricardo Ladeiras-Lopes
10. Francisca Leite
11. Helder Dores
Latest version: Dec 30, 2025

Large Language Model Biases in Healthcare: A Scoping Review and Call for an Integrated Assessment Framework

This article has 8 authors:
1. Lu He
2. D. Phuong Do
3. Vishesh Girish Shet
4. Omar Farghaly
5. Priya Deshpande
6. Praveen Madiraju
7. Jiancheng Ye
8. Molly Beestrum
Latest version: Jan 16, 2026