Validation of a Composite Mortality Endpoint in a Large, Clinico-Genomic Real-World Database of Patients with Advanced Cancer

Joshuah Kapilivsky
Farahnaz Islam
Emma K. Roth
Jessica Dow
Shannon Moran
Emilie Scherrer
Seung Won Hyun
Chithra Sangli

Abstract

Purpose

Real-world data (RWD) from electronic health records (EHRs) and next-generation sequencing are increasingly used to study treatment effectiveness in molecularly refined patient populations. Incomplete mortality data in EHRs can lead RWD studies to overestimate survival. While the National Death Index (NDI) is the gold standard for mortality data in the United States, its limited accessibility and reporting delays hinder timely research. EHR datasets are therefore often supplemented with external mortality data sources to improve death capture. This study evaluated a composite mortality variable against NDI records in a large cohort of patients with advanced cancer from a real-world oncology database.

Methods

De-identified clinical and molecular data from patients with advanced solid tumors were linked with third-party mortality and claims datasets using deterministic tokenization. Vital status and death dates were harmonized across sources. Patient identifiers were submitted to NDI, and true matches were de-identified and joined for analysis. Performance metrics (sensitivity, specificity, positive predictive value [PPV], negative predictive value [NPV]) were calculated using NDI as ground truth. Death date agreement was assessed at tolerances of 0, ±15, and ±30 days. Subgroup analyses and a cumulative cases/dynamic controls (CC/DC) approach were also performed.
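As an illustration of the metrics named above, the following is a minimal sketch of how sensitivity, specificity, PPV, NPV, and date agreement within a tolerance could be computed with NDI as the reference. The function names `performance_metrics` and `date_agreement` and the toy inputs are assumptions made for this example; they are not taken from the study and do not reproduce its results.

```python
from datetime import date


def performance_metrics(composite_dead, ndi_dead):
    """Sensitivity, specificity, PPV, and NPV, treating NDI as ground truth.

    Both arguments are equal-length sequences of booleans (True = death recorded).
    """
    pairs = list(zip(composite_dead, ndi_dead))
    tp = sum(1 for c, n in pairs if c and n)          # death in both sources
    fp = sum(1 for c, n in pairs if c and not n)      # death in composite only
    fn = sum(1 for c, n in pairs if not c and n)      # death in NDI only
    tn = sum(1 for c, n in pairs if not c and not n)  # death in neither

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def date_agreement(composite_dates, ndi_dates, tolerance_days):
    """Share of paired death dates that differ by at most tolerance_days.

    Assumes both sequences cover only patients with a death date in both sources.
    """
    diffs = [abs((c - n).days) for c, n in zip(composite_dates, ndi_dates)]
    if not diffs:
        return float("nan")
    return sum(1 for d in diffs if d <= tolerance_days) / len(diffs)


# Toy data for illustration only (not study data).
toy_composite = [True, True, False, False, True]
toy_ndi = [True, True, True, False, False]
toy_metrics = performance_metrics(toy_composite, toy_ndi)

toy_composite_dates = [date(2024, 1, 10), date(2024, 3, 1), date(2024, 6, 1)]
toy_ndi_dates = [date(2024, 1, 10), date(2024, 3, 12), date(2024, 6, 25)]
toy_agreement = {
    tol: date_agreement(toy_composite_dates, toy_ndi_dates, tol)
    for tol in (0, 15, 30)
}
```

In this sketch, date agreement is evaluated only over patients with a death date in both sources; whether the study used the same denominator is not stated in the abstract.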

Results

Among 17,597 patients, the composite mortality variable demonstrated 82% sensitivity and 95% specificity against NDI. PPV was 96%, and NPV was 77%. Exact date agreement was 86%, increasing to 94% within a ±15-day tolerance and 96% within a ±30-day tolerance. Incorporating third-party mortality and claims data raised sensitivity from 17% (EHR alone) to 82%. Sensitivity was broadly stable across subgroups, with some variation by age, cancer type, geographic region, and race. With the CC/DC approach, sensitivity was 96% at 6 months, 97% at 12 months, and 98% at 24 months, with specificity above 98% across these timeframes.
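The time-specific figures from the CC/DC approach can be read as follows: at each horizon, patients with an NDI death on or before the horizon are the cumulative cases, and all other patients are the dynamic controls. The sketch below shows one simple way to compute sensitivity and specificity under that definition. It assumes every patient has reference follow-up through the horizon (no censoring adjustment), and the function name `cc_dc_metrics`, the record layout, and the toy values are hypothetical; the study's actual implementation is not described in the abstract.

```python
def cc_dc_metrics(records, horizon_days):
    """Time-specific sensitivity and specificity (cumulative cases/dynamic controls).

    records: iterable of (ndi_death_day, composite_death_day) pairs, each counted in
    days from the index date, with None meaning no death recorded in that source.
    Assumption: all patients are observable through horizon_days in the reference.
    """
    tp = fn = fp = tn = 0
    for ndi_day, composite_day in records:
        is_case = ndi_day is not None and ndi_day <= horizon_days
        flagged = composite_day is not None and composite_day <= horizon_days
        if is_case:
            # Cumulative case: reference death on or before the horizon.
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            # Dynamic control: alive at the horizon per the reference.
            if flagged:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy records for illustration only (not study data).
toy_records = [(100, 100), (150, None), (400, 350), (None, None), (300, 200)]
toy_sens_180, toy_spec_180 = cc_dc_metrics(toy_records, 180)
toy_sens_365, toy_spec_365 = cc_dc_metrics(toy_records, 365)
```

Because case and control status are redefined at each horizon, the same patient can count as a control at 6 months and as a case at 12 months, which is why these estimates are reported per timeframe.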

Conclusions

The composite mortality variable is a robust, reliable endpoint for real-world evidence analyses. Its high accuracy for identified deaths and appropriate censoring of lost-to-follow-up patients support its use in overall survival analyses. This validation is a foundational step towards high-quality research to improve patient outcomes and advance cancer drug development using this multimodal dataset.

Clinical trial number: not applicable

Version published to 10.1101/2025.08.20.25334011 on medRxiv
Aug 24, 2025

