Development and Internal Validation of a Clinical Cox Model for Predicting Overall Survival in Patients with Lung Cancer: Real-World Evidence from Seven Hospitals in China

Abstract

Background: In routine practice, complete TNM staging is often unavailable, limiting the prognostic utility of staging alone. We investigated whether a pragmatic clinical-variable–only Cox model based on routinely collected data could provide robust short- to mid-term risk stratification for overall survival (OS) compared with a simple stage-only model using an ordinal three-level system (early/mid/late).

Methods: We undertook a multicentre retrospective cohort study across seven hospitals in China. Consecutive adults with pathologically confirmed lung cancer diagnosed between 2011 and 2023 were screened; after prespecified exclusions, 865 patients were included and split into training (n=584) and internal validation (n=281) cohorts. Median follow-up was 8 months (range, 1–114). Prespecified predictors were smoking status, white blood cell count (WBC), lymphocyte count (LYM), prothrombin activity (PTA), D-dimer level, receipt of chemotherapy in the initial treatment window, and age. The comparator was a staging-only Cox model (three-level staging). The primary outcome was OS, measured from diagnosis to death and censored at last contact. Performance was evaluated using time-dependent AUCs at 6/12/18 months and the C-index; calibration intercept/slope and plots; decision-curve analysis (DCA); and categorical and continuous net reclassification improvement (NRI). Kaplan–Meier curves stratified by predicted risk group were used to assess separation.

Results: Multivariable Cox regression showed independent associations with OS for smoking (HR 1.57, 95% CI 1.26–1.96), WBC (HR 1.015, 95% CI 1.000–1.030), LYM (HR 0.98, 95% CI 0.966–0.993), PTA (HR 0.986, 95% CI 0.981–0.992), D-dimer (HR 1.036, 95% CI 1.019–1.053), chemotherapy (HR 0.434, 95% CI 0.273–0.688), and age (HR 1.038, 95% CI 1.027–1.048). Discrimination of the clinical model at 6/12/18 months was acceptable to good (training AUCs 0.729/0.761/0.761; validation AUCs 0.804/0.789/0.803), with overall good calibration (close alignment with the ideal line in the 0.30–0.80 range). The staging-only comparator had validation AUCs of approximately 0.512/0.516/0.525, close to chance. The clinical model achieved greater net benefit than the comparator across DCA thresholds of approximately 0.10–0.80, and time-dependent discrimination substantially favoured the clinical model (ΔC, reported as fit2−fit1, = −0.253; 95% CI −0.292 to −0.204; p=6.56×10⁻³⁰). Reclassification improved at each horizon (categorical NRI 0.839/0.473/0.473; continuous NRI 0.879/0.884/0.884 at 6/12/18 months). Kaplan–Meier curves showed clear, monotonic separation of low-, intermediate-, and high-risk groups.

Conclusions: In this seven-centre real-world cohort, a clinical-variable–only Cox model built from routinely available data outperformed a staging-only approach for predicting 6–18-month OS, showing superior discrimination, acceptable calibration, greater net benefit, and substantial reclassification gains. These findings support the use of readily obtainable clinical data for short- to mid-term risk stratification and shared decision-making when detailed TNM information is scarce.
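For readers unfamiliar with decision-curve analysis, the net benefit quantity compared across thresholds can be sketched as follows. This is a minimal illustration of the standard binary-outcome definition, not the authors' code: the function names and the toy data in the usage note are hypothetical, and the sketch assumes every patient's status at the chosen horizon is known. With censored survival data such as those in this study, a censoring-aware estimate (for example, one based on Kaplan–Meier event probabilities) would be needed instead.

```python
def net_benefit(events, risks, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold.

    events: 0/1 outcome by the horizon (assumed fully observed, no censoring)
    risks: predicted probabilities of the event by the same horizon
    threshold: risk threshold p_t, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(events) != len(risks) or len(events) == 0:
        raise ValueError("events and risks must be non-empty and equal length")
    n = len(events)
    # Patients flagged as high risk who did / did not have the event.
    true_pos = sum(1 for e, r in zip(events, risks) if r >= threshold and e == 1)
    false_pos = sum(1 for e, r in zip(events, risks) if r >= threshold and e == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(events, threshold):
    """Reference strategy: intervene on every patient regardless of risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(events) / len(events)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

As a usage example with made-up values, calling `net_benefit([1, 1, 0, 0, 1, 0, 0, 0], [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.8], 0.25)` flags six patients (three events, three non-events) and gives 3/8 − (3/8)(1/3) = 0.25, against roughly 0.167 for treating everyone. A decision curve repeats this calculation over a grid of thresholds for each model and for the treat-all and treat-none (net benefit 0) strategies.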
