Calculating epidemiological outcomes from simulated longitudinal data
Abstract
Microsimulation models generate individual life trajectories that must be summarized as population-level outcomes for model calibration and validation. While there are established formulas for calculating outcomes such as prevalence, incidence, and lifetime risk from cross-sectional and short-term longitudinal studies, limited guidance exists for calculating these outcomes from long-term longitudinal data, because large-scale studies covering events across the human lifespan are rare. This technical report presents several methods for calculating epidemiological outcomes from simulated longitudinal data, ranging from replicating a real-world study design to fully incorporating longitudinal disease and exposure durations. We provide an open-source code base with functions in R to calculate the prevalence, incidence, age-conditional risk, lifetime risk, and disease-specific mortality of a condition from individual-level time-to-event data. In addition, we provide guidance and code for calculating cancer-related outcomes from individual-level data, such as the stage distribution at diagnosis, the distribution of precancerous lesion multiplicity, and the mean dwell and sojourn times. Because certain outcomes can be formulated in more than one way, we call for increased transparency in reporting how summary outcomes are derived from microsimulation model outputs, and we anticipate that this report will facilitate the calculation of epidemiological outcomes in both simulated and real-world data.
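To make concrete what calculating outcomes from individual-level time-to-event data can look like, the following is a minimal sketch in Python. It is not taken from the report's R code base. The `Person` record, the function names `prevalence` and `incidence_rate`, and the four-person cohort are hypothetical and chosen only for illustration. The sketch assumes that every trajectory is followed from birth to death and that disease, once present, persists until death. Point prevalence is computed among those alive at a given age, and the incidence rate is computed as new cases per person-year at risk within an age interval.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Person:
    """Hypothetical simulated life trajectory (illustrative only)."""
    onset_age: Optional[float]  # age at disease onset; None if never diseased
    death_age: float            # age at death (end of the trajectory)


def prevalence(people: Sequence[Person], age: float) -> float:
    """Point prevalence at `age`: diseased and alive / alive."""
    alive = [p for p in people if p.death_age > age]
    if not alive:
        return float("nan")
    cases = sum(
        1 for p in alive if p.onset_age is not None and p.onset_age <= age
    )
    return cases / len(alive)


def incidence_rate(people: Sequence[Person], start: float, end: float) -> float:
    """New cases per person-year at risk between ages `start` and `end`."""
    events = 0
    person_years = 0.0
    for p in people:
        onset = p.onset_age if p.onset_age is not None else float("inf")
        # Time at risk ends at onset, death, or the end of the interval.
        exit_age = min(onset, p.death_age, end)
        if exit_age <= start:
            continue  # died or became diseased before the interval began
        person_years += exit_age - start
        if onset < end and onset <= p.death_age:
            events += 1
    return events / person_years if person_years > 0 else float("nan")


# Hypothetical four-person cohort used only to exercise the functions.
cohort = [
    Person(onset_age=55, death_age=80),
    Person(onset_age=None, death_age=70),
    Person(onset_age=62, death_age=65),
    Person(onset_age=None, death_age=90),
]

print(prevalence(cohort, 60))          # share of those alive at 60 with disease
print(incidence_rate(cohort, 60, 70))  # cases per person-year, ages 60 to 70
```

Other formulations are possible, for example counting prevalent person-time over an age band rather than status at a single age, which is one reason the choice of formulation should be reported alongside the result.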