Reverse Double-Dipping: When Data Dips You, Twice—Stimulus-Driven Information Leakage in Naturalistic Neuroimaging

Abstract

This article elucidates a methodological pitfall of cross-validation for evaluating predictive models applied to naturalistic neuroimaging data: ‘reverse double-dipping’ (RDD). In a broader context, this problem is also known as ‘leakage in training examples’, which is difficult to detect in practice. RDD can occur when predictive modeling is applied to data from a conventional neuroscientific design, characterized by a limited set of stimuli repeated across trials and/or participants. It produces spurious predictive performance through overfitting to repeated, stimulus-driven signals, even in the presence of independent noise. Through a theoretical formulation followed by comprehensive simulations and real-world examples, the article demonstrates how such information leakage can occur and how severely it can compromise results and conclusions, particularly when combined with widespread informal reverse inference. The article concludes with practical recommendations for researchers to avoid RDD in their experimental design and analysis.
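To make the mechanism concrete, below is a minimal simulation sketch in Python (assuming NumPy and scikit-learn; the stimulus counts, noise level, and the GroupKFold remedy are illustrative choices, not taken from the article). Labels are assigned arbitrarily per stimulus, so there is no generalizable brain-to-label mapping to learn; yet trial-wise cross-validation appears far above chance, because repeated presentations of the same stimulus leak between training and test folds and the model memorizes stimulus-driven patterns, which is the RDD scenario described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, KFold, cross_val_score

rng = np.random.default_rng(0)
n_stimuli, n_repeats, n_features = 10, 20, 50

# Each unique stimulus evokes a fixed, stimulus-driven response pattern.
patterns = rng.normal(size=(n_stimuli, n_features))

# Labels are assigned arbitrarily per stimulus (balanced), so there is
# nothing generalizable for a model to learn about new stimuli.
stim_labels = rng.permutation(np.repeat([0, 1], n_stimuli // 2))

# Stimuli are repeated across trials; each trial adds independent noise.
stim_ids = np.repeat(np.arange(n_stimuli), n_repeats)
X = patterns[stim_ids] + rng.normal(size=(stim_ids.size, n_features))
y = stim_labels[stim_ids]

clf = LogisticRegression(max_iter=1000)

# Trial-wise CV: repeats of the same stimulus land in both training and
# test folds, so the model can memorize stimulus patterns (RDD).
leaky = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0))

# Stimulus-grouped CV: all repeats of a stimulus stay in a single fold.
valid = cross_val_score(clf, X, y, groups=stim_ids, cv=GroupKFold(5))

print(f"trial-wise CV accuracy (leaky): {leaky.mean():.2f}")  # typically far above 0.5
print(f"stimulus-grouped CV accuracy:   {valid.mean():.2f}")  # near chance (0.5)
```

Keeping all repetitions of a stimulus within a single fold is one common remedy for this form of leakage; the article's own recommendations should be consulted for design-stage fixes.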
