Emulation of placebo-controlled index trials using observational data with cloning, censoring and weighting: Empirical assessment of constraints and credibility


Abstract

Objective

Target trial emulation (TTE) has become a prominent approach to conducting observational effectiveness studies, yet limited attention has been paid to the nuances of emulating placebo-controlled trials in this framework using claims data. As a demonstration, we aimed to extend the evidence generated by the TOPCAT trial, which compared spironolactone with placebo in patients with heart failure with preserved ejection fraction (HFpEF), to the U.S. HFpEF population.

Methods and Analysis

We estimated the observational analogue of the per-protocol effect for spironolactone initiation and continued use versus non-initiation in 2012-2020 Medicare claims with the clone-censor-weight approach. We evaluated two composite effectiveness endpoints, each combining heart failure hospitalization (HHF) and cardiac arrest with either all-cause or cardiovascular mortality, as well as each component except cardiac arrest as an individual endpoint. Anticipating threats to validity through residual confounding, we pre-specified two guardrails: 1) benchmarking against results from TOPCAT Americas, and 2) evaluation of non-cardiovascular mortality as a negative control outcome to quantify and correct for the magnitude of residual bias. To demonstrate investigator-induced biases frequently seen in studies not using the TTE framework, we additionally implemented a ‘naïve’ ever- versus never-user comparison that misclassified immortal person-time before spironolactone initiation as exposed.
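The cloning and artificial-censoring steps of the clone-censor-weight approach can be sketched on toy data as follows. This is a minimal illustration, not the authors' implementation: the 30-day grace period, the `Patient` and `Clone` records, and the `clone_and_censor` helper are all hypothetical, censoring for discontinuation (needed for the "continued use" part of the per-protocol strategy) is left out, and the weighting step (inverse probability of artificial censoring, usually from a fitted model) is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional

GRACE_DAYS = 30  # hypothetical grace period; the abstract does not state one


@dataclass
class Patient:
    pid: int
    init_day: Optional[int]   # day of first spironolactone fill (None = never)
    event_day: Optional[int]  # day of the outcome event (None = no event observed)
    end_day: int              # administrative end of follow-up


@dataclass
class Clone:
    pid: int
    arm: str            # "initiate" or "no_initiation"
    follow_up: int      # days at risk while compatible with the strategy
    event: bool         # outcome observed while still following the strategy
    art_censored: bool  # artificially censored for deviating from the strategy


def clone_and_censor(p: Patient, grace: int = GRACE_DAYS) -> List[Clone]:
    """Create one clone per strategy and censor each clone when it deviates."""
    last_day = p.event_day if p.event_day is not None else p.end_day
    had_event = p.event_day is not None

    # Arm 1: initiate within the grace period.
    started_in_time = p.init_day is not None and p.init_day <= min(grace, last_day)
    if started_in_time or last_day <= grace:
        # Either initiated in time, or follow-up ended before the grace
        # period did, so the clone never deviated from the strategy.
        arm1 = Clone(p.pid, "initiate", last_day, had_event, False)
    else:
        # Still untreated when the grace period ends: censor at that point.
        arm1 = Clone(p.pid, "initiate", grace, False, True)

    # Arm 2: never initiate.
    if p.init_day is None or p.init_day >= last_day:
        arm2 = Clone(p.pid, "no_initiation", last_day, had_event, False)
    else:
        # Censor at the day of initiation.
        arm2 = Clone(p.pid, "no_initiation", p.init_day, False, True)

    return [arm1, arm2]


patients = [
    Patient(1, init_day=10, event_day=None, end_day=365),    # early initiator
    Patient(2, init_day=None, event_day=20, end_day=365),    # event inside grace period
    Patient(3, init_day=None, event_day=200, end_day=365),   # never treated, late event
]
clones = [c for p in patients for c in clone_and_censor(p)]
for c in clones:
    print(c)
```

Patient 2 shows why cloning removes immortal time: an event during the grace period, before treatment has been assigned by the data, counts in both arms rather than only in the untreated one.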

Results

We included 320,881 patients with HFpEF in the overall Medicare cohort (mean age 80.6 years [SD 8.37]; female 62%), of whom 49,729 qualified for benchmarking against TOPCAT. In the benchmarking cohort, relative risks with spironolactone use compared to non-use for effectiveness outcomes ranged between 0.97 (95%-CI = [0.94; 1.01]) for the composite with cardiovascular death and 1.14 (95%-CI = [1.11; 1.18]) for all-cause mortality. The negative control of non-cardiovascular mortality suggested the presence of residual confounding. After bias correction, our relative risks were in line with TOPCAT hazard ratios for HHF-driven outcomes (e.g. composite with cardiovascular death 0.88 (95%-CI = [0.85; 0.91]) in our study vs. 0.82 (95%-CI = [0.69; 0.98]) in TOPCAT), but not for mortality outcomes (e.g. all-cause death 1.04 (95%-CI = [1.01; 1.07]) vs. 0.83 (95%-CI = [0.68; 1.02]) in TOPCAT). Estimates in the overall cohort were comparable to the benchmarking cohort. The naïve analysis of ever- versus never-use produced substantially biased results (e.g. 1.22 (95%-CI = [1.13; 1.30]) for the composite with cardiovascular death to 0.58 (95%-CI = [0.53; 0.65]) for all-cause death, benchmarking cohort).
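The abstract reports neither the negative-control estimate nor the correction formula. As a rough check, and assuming (this is an assumption, not something the abstract confirms) that the correction acts multiplicatively on the relative-risk scale, the reported uncorrected and corrected point estimates can be used to back out the implied bias factor. The `implied_bias_factor` helper is hypothetical.

```python
import math

# Point estimates quoted in the abstract: (uncorrected RR, bias-corrected RR).
reported = {
    "composite with cardiovascular death": (0.97, 0.88),
    "all-cause mortality": (1.14, 1.04),
}


def implied_bias_factor(rr_uncorrected: float, rr_corrected: float) -> float:
    """Bias factor B such that rr_corrected = rr_uncorrected / B.

    Assumes a multiplicative correction on the relative-risk scale,
    which is an illustrative assumption rather than the authors' stated method.
    """
    return rr_uncorrected / rr_corrected


factors = {name: implied_bias_factor(u, c) for name, (u, c) in reported.items()}
for name, f in factors.items():
    print(f"{name}: implied bias factor {f:.2f} (log scale {math.log(f):+.3f})")
```

Both outcomes give a factor that rounds to 1.10, which is consistent with a single shared correction applied across outcomes. Because the inputs are themselves rounded, this is only an approximate reading of the reported numbers.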

Conclusion

In emulations of placebo-controlled trials, residual confounding remains a persistent threat, and it is critical to build in pre-specified guardrails to detect and address this bias.

Key messages

  • What is already known on this topic – Target trial emulation presents a principled framework for designing observational studies, and within this framework, the clone-censor-weight approach has been recommended to avoid immortal time bias when emulating placebo-controlled trials.

  • What this study adds – Even after fully avoiding immortal time through the clone-censor-weight approach within the target trial framework, observational studies of non-use comparisons remain prone to other sources of bias. Bias analysis and benchmarking can help gauge the extent and direction of such bias.

  • How this study might affect research, practice or policy – This study showcases how researchers can leverage pre-specified benchmarking and net bias analysis as guardrails when using the clone-censor-weight design for non-use comparisons to ensure accurate interpretation. It also provides auxiliary evidence on the effects of spironolactone in HFpEF for the Medicare population beyond TOPCAT that may inform clinical decision-making.
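The immortal time problem referred to in the key messages can be made concrete with a toy person-time calculation. The sketch below uses one hypothetical patient who initiates spironolactone on day 100 and is followed until day 300. It only illustrates the misclassification made by an ever- versus never-user comparison; it is not the clone-censor-weight design, and both function names are made up for this example.

```python
from typing import Dict, Optional


def naive_person_time(init_day: Optional[int], last_day: int) -> Dict[str, int]:
    """Ever- vs never-user: all follow-up of an eventual initiator counts as exposed."""
    if init_day is None:
        return {"exposed": 0, "unexposed": last_day}
    return {"exposed": last_day, "unexposed": 0}


def time_varying_person_time(init_day: Optional[int], last_day: int) -> Dict[str, int]:
    """Follow-up before initiation is counted as unexposed, the rest as exposed."""
    if init_day is None:
        return {"exposed": 0, "unexposed": last_day}
    return {"exposed": last_day - init_day, "unexposed": init_day}


naive = naive_person_time(init_day=100, last_day=300)
correct = time_varying_person_time(init_day=100, last_day=300)

# Days wrongly credited to the exposed group: the patient had to survive
# event-free through them in order to become an initiator at all.
immortal_days = naive["exposed"] - correct["exposed"]
print(naive, correct, immortal_days)
```

Crediting those event-free days to the exposed group inflates its denominator and tends to make treatment look protective.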
