PED-X-Bench: A Benchmark of Adult-to-Pediatric Extrapolation Decisions in FDA Drug Labels

Apoorva Srinivasan
Jacob Berkowitz
Nadine A. Friedrich
Kevin Tsang
Aditi Kuchi
José Acitores
Michael Zietz
Ryan S. Czarny
Hongyu Liu
Nicholas P. Tatonetti

Abstract

Pediatric trials are ethically and logistically difficult, so the U.S. FDA often extrapolates adult data to children when justified. Yet no public resource systematically documents these decisions. We present PED-X-Bench, the first dataset and benchmark that encodes FDA pediatric-extrapolation outcomes as a four-way classification task (Full, Partial, None, Unlabeled). PED-X-Bench contains 737 FDA drug-label sections (≈ 1 M words of source text) for approvals issued 2007–2024 across all therapeutic areas. A two-stage o3-mini prompting pipeline mined full FDA label text; nine domain reviewers then adjudicated a stratified sample of 135 labels, yielding an accuracy and F1 of 0.74 and 0.63, respectively (inter-annotator κ = 0.678), and spot-checked the remainder. For every drug we release the ground-truth label, concise efficacy and pharmacokinetic/safety summaries, and harmonized study metadata. To showcase utility, we release two baseline models: (i) a logistic-regression classifier that uses structured metadata from FDA’s pediatric-labeling dataset, and (ii) a fine-tuned BigBird BERT that ingests full label text. Both baselines perform modestly, leaving ample headroom for future work. PED-X-Bench enables research on pediatric drug development, clinical NLP, and drug safety; the dataset card and code are available at github.com/tatonetti-lab/PedXBench and huggingface.co/datasets/apoorvasrinivasan/Ped-X-Bench
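As a rough illustration of the agreement metrics quoted in the abstract, the sketch below computes accuracy, F1, and a kappa statistic for the four extrapolation classes. It rests on several assumptions not stated in the source: F1 is macro-averaged, kappa is the two-rater Cohen's variant (the abstract does not say which variant was used for nine reviewers), and the `gold` and `pred` lists are hypothetical toy labels, not data from PED-X-Bench.

```python
from collections import Counter

# The four classes named in the abstract.
LABELS = ["Full", "Partial", "None", "Unlabeled"]


def accuracy(gold, pred):
    """Fraction of items where the two label sequences agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(a, b):
    """Cohen's kappa for two raters: (observed - expected) / (1 - expected)."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical toy labels, for illustration only.
gold = ["Full", "Full", "Partial", "None", "Unlabeled", "Partial", "None", "Full"]
pred = ["Full", "Partial", "Partial", "None", "Unlabeled", "Full", "None", "Full"]

print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro-F1 = {macro_f1(gold, pred):.3f}")
print(f"kappa    = {cohen_kappa(gold, pred):.3f}")
```

Macro averaging weights each class equally, which matters when one class (for example Unlabeled) is much more or less frequent than the others; a micro-averaged or weighted F1 would give different numbers on the same predictions.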

Version published to 10.1101/2025.05.22.25328187v1 on medRxiv
May 23, 2025

A Systematic Review of Large Language Models in Medical Specialties: Applications, Challenges and Future Directions

This article has 7 authors:
1. Asma Musabah Alkalbani
2. Ahmed Salim Alrawahi
3. Ahmad Salah
4. Venus Haghighi
5. Yang Zhang
6. Salam Alkindi
7. Quan Z Sheng
This article has no evaluations. Latest version Apr 16, 2025

A Single-Cell Atlas Of Human Pediatric Liver Reveals Age-Related Hepatic Gene Signatures

This article has 18 authors:
1. Rachel D Edgar
2. Diana Nakib
3. Damra Camat
4. Sai Chung
5. Patricia Lumanto
6. Jawairia Atif
7. Catia T. Perciani
8. Xue-Zhong Ma
9. Cornelia Thoeni
10. Nilosa Selvakumaran
11. Justin Manuel
12. Blayne Sayed
13. Koen Huysentruyt
14. Amanda Ricciuto
15. Ian McGilvray
16. Yaron Avitzur
17. Gary D Bader
18. Sonya A MacParland
This article has no evaluations. Latest version Apr 25, 2025

Comparative Evaluation the Knowledge of Large Language Models about Response Evaluation Criteria in Solid Tumors?

This article has 3 authors:
1. Eren Çamur
2. Turay Cesur
3. Yasin Celal Güneş
This article has no evaluations. Latest version May 7, 2025
