Leveraging Large Language Models in Pharmacometrics: Evaluation of NONMEM Output Interpretation and Simulation Capabilities

Abstract

Advancements in large language models (LLMs) have suggested their potential utility for diverse pharmacometrics tasks. This study investigated the performance of LLMs in generating model structure diagrams, publication-ready parameter tables, and analysis reports, and in conducting simulations, using output files from pharmacometric models. Forty-four NONMEM output files were obtained from the GitHub software repository. The performance of Claude 3.5 Sonnet (Claude) and ChatGPT 4o was compared with that of two other candidate LLMs, Gemini 1.5 Pro and Llama 3.2. Prompt engineering was conducted for Claude on pharmacometrics tasks such as generating model structure diagrams, parameter tables, and analysis reports; simulations were conducted using ChatGPT. Claude Artifacts was used to visualize the model structure diagrams, parameter tables, and analysis reports, and a Shiny R application was implemented. Claude was selected for investigation after performance comparisons with ChatGPT 4o, Gemini 1.5 Pro, and Llama on the model structure diagram and parameter table generation tasks. Claude successfully generated model structure diagrams for 40 (90.9%) of the 44 NONMEM output files with the initial prompts, and the remaining four were resolved with an additional prompt. Claude consistently generated accurate parameter summary tables and succinct model analysis reports. Modest variability was identified in the model structure diagrams generated from replicate prompts. ChatGPT demonstrated simulation capabilities but revealed limitations with complex PK/PD models. LLMs have the potential to enhance key pharmacometric modeling tasks; however, expert review of the generated results is essential.
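To make the prompting workflow concrete, the following is a minimal sketch in Python using the official Anthropic SDK: a NONMEM output file is read and sent to a Claude model with a request for a parameter summary and structure description. The file name, model snapshot identifier, and prompt wording are illustrative assumptions, not the engineered prompts used in the study.

import anthropic  # official Anthropic Python SDK

# Hypothetical NONMEM output file; the study's 44 files came from GitHub.
with open("run001.lst") as f:
    nonmem_output = f.read()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Assumed prompt wording for illustration only.
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # a Claude 3.5 Sonnet snapshot
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": (
            "Read the NONMEM output below. Summarize the final parameter "
            "estimates and their relative standard errors as a table, and "
            "describe the model structure.\n\n" + nonmem_output
        ),
    }],
)

print(message.content[0].text)  # parameter table and structure description

Any table or diagram produced this way should be checked against the original output file, consistent with the authors' conclusion that expert review of LLM-generated results remains essential.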
