Modeling strategies for a flexible estimation of the crude cumulative incidence in the context of long follow-ups: model choice and predictive ability evaluation

Abstract

Background
Advancements in treatments for chronic diseases, such as breast cancer, have expanded our ability to observe patient outcomes beyond disease-related mortality, including events like distant recurrence. However, competing events can cloud the interpretation of the primary outcome, making the crude cumulative incidence function the only reliable measure for accurate follow-up analysis. Long-term studies call for flexible modeling to accommodate intricate, time-dependent effects and interactions among covariates, something that traditional models such as the proportional sub-distribution hazards model often struggle to address. While more adaptable methods have been proposed, the need remains to systematically assess model complexity, especially for exploratory purposes. This article presents a statistical learning workflow designed to evaluate model complexity for the crude cumulative incidence and introduces a time-dependent metric of predictive accuracy. The framework offers researchers an enhanced toolkit for robustly tackling the complexities of long-term outcome modeling.

Methods
Our approach is demonstrated using data on time to distant breast cancer recurrence from the Milan 1 and Milan 3 trials, both with extensive follow-up. We employ two flexible modeling frameworks, pseudo-observations and sub-distribution hazard models, each enhanced with spline functions to capture the baseline hazard and risk. The proposed workflow integrates graphical representations of Aalen-Johansen estimates of the crude cumulative incidence (see the sketch below) to visually hypothesize and adjust model complexity so that it matches the studied phenomenon. Information criteria guide model selection toward an approximation of the underlying data structure. Using bootstrapped data perturbations and time-dependent predictive accuracy measures adjusted with Harrell's optimism correction, we identify the optimal model structure, balancing explainability, predictivity, and generalizability.

Results
Our findings emphasize the importance of data perturbation and validation through optimism-corrected predictive measures after the original data analysis. The initial model structure may differ from the most robust model identified through iterative perturbation. The ideal model is one with high robustness (most frequently selected across perturbations) together with strong explainability and predictive capacity. When perturbation results are inconsistent, evaluating several time-dependent predictive measures offers additional insight, particularly into any trade-off between model complexity and predictive gain. Where the predictive improvement is minimal, simpler and more explainable model structures are preferable.

Conclusions
The proposed statistical learning workflow, informed by domain expertise, allows clinically relevant complexities to be incorporated into prognostic modeling. Our results suggest that, in many cases, embracing a nuanced, flexible model structure may serve future predictions better than opting for simpler models. This approach demonstrates the value of balancing model simplicity and complexity to achieve meaningful, clinically useful insights.
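To make the first step of the Methods concrete, the following is a minimal sketch, not the authors' code, of a nonparametric Aalen-Johansen estimate of the crude cumulative incidence in the presence of a competing event. It assumes the Python `lifelines` library and uses synthetic data; the variable names, event coding, and simulated follow-up scheme are illustrative assumptions rather than details from the Milan trials.

```python
# Minimal sketch (synthetic data): Aalen-Johansen estimate of the crude
# cumulative incidence of the event of interest (e.g., distant recurrence)
# when a competing event (e.g., death from other causes) is present.
import numpy as np
from lifelines import AalenJohansenFitter

rng = np.random.default_rng(42)
n = 500

# Hypothetical competing-risks data: 0 = censored, 1 = distant recurrence,
# 2 = competing event.
time_to_recurrence = rng.exponential(scale=12.0, size=n)
time_to_competing = rng.exponential(scale=20.0, size=n)
time_to_censoring = rng.uniform(0.0, 25.0, size=n)

observed_time = np.minimum.reduce(
    [time_to_recurrence, time_to_competing, time_to_censoring]
)
event_type = np.select(
    [observed_time == time_to_recurrence, observed_time == time_to_competing],
    [1, 2],
    default=0,
)

# Aalen-Johansen estimator: treats event 2 as a competing event rather than
# censoring, so the curve estimates the crude cumulative incidence of event 1.
ajf = AalenJohansenFitter()
ajf.fit(durations=observed_time, event_observed=event_type, event_of_interest=1)

# The fitted crude cumulative incidence over time (a pandas DataFrame).
print(ajf.cumulative_density_.tail())
# ajf.plot()  # in an interactive session, inspect the curve's shape
```

In the workflow described above, plotting this curve is the visual starting point for hypothesizing how much flexibility (e.g., number and placement of spline knots, time-dependent effects) a candidate model needs; the bootstrap perturbations and optimism-corrected predictive measures then test whether that hypothesized complexity is robust and worth its cost.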
