The language of journal articles and its association with statistical significance
Abstract
Researchers face incentives to write up their empirical findings in a way that maximizes publication success. We analyze the language of journal articles and its association with statistical significance to explore questionable research practices at the writing-up stage, using 140,606 articles from health, biology, psychology, economics, and multidisciplinary journals over 32 years. For most disciplines, a higher share of non-significant main findings is associated with more hedging, more negative striking words, fewer positive striking words, and fewer superlatives. We find no evidence that authors upsell ambiguous results using sensational language, nor that ambiguous results are written up less readably. On the contrary, articles with a higher share of statistically significant main findings are written up more sensationally. We find that emphasis on (marginal) statistical significance increases with the share of non-significant main findings, reflecting a dichotomized interpretation of p-values based on arbitrary thresholds. In particular, p-excuses give the impression of statistical significance when the finding is not actually significant, consistent with the notion of 'spin'. This study provides empirical insights that might help researchers reflect on how they write up empirical findings. More training, and fewer incentives to sell findings using sensational language and spin, can help to improve academic writing.