Ignoring multiplicity and clinical significance may undermine trial conclusions: Evidence from a re-analysis of the Grintuss® pediatric cough trial



Abstract

Background. Reporting and analysis standards for randomized trials are well established, yet evidence used to support pediatric cough remedies sometimes relies on incomplete reporting, extensive hypothesis testing, and interpretation dominated by p-values rather than effect sizes and clinical relevance. We use a pediatric cough trial of Grintuss® syrup as a methodological case study to assess how multiplicity and effect-size–focused reporting may alter trial conclusions.

Methods. We conducted an audit of the trial’s design, analysis, reporting, and interpretation against established principles for randomized trials, focusing on prespecification, completeness of reporting, use of within-arm versus between-arm inference, multiplicity, baseline imbalance, and appropriateness of parametric tests for bounded ordinal outcomes. Because individual-level data were not available despite direct contact with the authors, we performed a constrained re-analysis using published figures. We reconstructed approximate means and standard deviations from plotted values and used sample sizes reported or implied in the article to derive approximate 95% confidence intervals for group means and day-4 between-arm differences. We enumerated the hypothesis tests reported or implied and applied Holm’s step-down procedure to adjust p-values for multiplicity. For a subgroup comparison on the proportion of improved cough, we computed an absolute risk difference with an approximate confidence interval.

Results. The published report implies at least 15 non-independent hypothesis tests across outcomes, time windows, and subgroups, without prespecification or multiplicity adjustment. After multiplicity control, nominally significant findings most directly interpretable as evidence of efficacy did not remain below conventional thresholds. Reconstructed estimates suggested, at most, a small early advantage for night-time cough at day 4 with a confidence interval spanning no effect, and no clear day-time advantage at day 4; the report provided no clear between-arm signal at day 8. The subgroup analysis showed a large absolute difference but with wide uncertainty.

Conclusions. In this pediatric cough trial, conclusions appear sensitive to selective p-value reporting, extensive multiplicity, and limited attention to effect size and clinical relevance. For pediatric cough trials and similar studies, robust inference requires prespecified estimands and endpoints, baseline-adjusted between-arm analyses with confidence intervals, explicit multiplicity strategies, and transparent reporting aligned with randomized trial standards.
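
As an illustration of the multiplicity adjustment named in the Methods, the following is a minimal sketch of Holm’s step-down procedure in Python. The function name `holm_adjust` and the p-values in `illustrative_p` are hypothetical and chosen only for demonstration; they are not the values reported in or reconstructed from the trial, and this is not the authors’ analysis code.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order.

    The k-th smallest raw p-value (k = 0, 1, ...) is multiplied by (m - k),
    where m is the number of tests. A running maximum keeps the adjusted
    values non-decreasing, and each value is capped at 1.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for five tests (NOT taken from the trial).
illustrative_p = [0.001, 0.02, 0.03, 0.04, 0.2]
adjusted = holm_adjust(illustrative_p)

for raw, adj in zip(illustrative_p, adjusted):
    print(f"raw p = {raw:.3f}  ->  Holm-adjusted p = {adj:.3f}")
```

With 15 tests, as counted in the Results, the smallest raw p-value would need to fall below 0.05/15 (about 0.0033) to remain significant at the 0.05 level under this procedure, which is why nominally significant findings can lose significance after adjustment.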
