When to Use Precision-Weighted Cross-Validation in Meta-Regression: A Simulation Study and Empirical Comparison for consideration as a Methodology article in Systematic Reviews

Abstract

Background: Cross-validation is regularly used to assess predictive performance in meta-regression. There is, however, no consensus on whether predictions should be evaluated using precision weights (inversely proportional to total study variance) or equal weights. This choice can affect cross-validated R² estimates and downstream inferences about model performance.

Methods: We implemented leave-one-out cross-validation for meta-regression with both precision-weighted and unweighted evaluation metrics, using moment-based estimators for between-study heterogeneity. Performance was evaluated across 32 simulated scenarios (300 replications each) that varied in sample size (k = 10-40), number of moderators (p = 1-2), heterogeneity (τ² = 0.05-0.20), true R²het (0%-25%), and sampling variance structure (homogeneous gamma-distributed vs. heterogeneous with "anchor" studies having extreme precision). We also analysed seven real meta-analytic datasets from education, medicine, and public health, obtained from R. Diagnostic criteria based on I² statistics and weight dispersion (coefficient of variation, Gini coefficient) were developed to guide method selection.

Results: In simulations with homogeneous sampling variances, weighted and unweighted cross-validation agreed closely (mean |ΔCV| = 2.6 percentage points). With heterogeneous variances including anchor studies, unweighted cross-validation produced implausible estimates, exceeding apparent R² by more than 20 percentage points in 81% of scenarios (vs. 0% for weighted). Across the seven empirical datasets, the mean absolute difference between the two estimates was 14.8 pp (range: 0-81 pp). The Passive Smoking dataset showed the widest discrepancy: unweighted CV estimated R² at 81.8% vs. 0.6% for weighted CV (apparent R² = 45.5%). Truncation of negative τ²CV estimates to zero occurred in 23% of weighted and 8% of unweighted scenarios. Based on the diagnostic criteria (I² > 50% or weight CV > 0.6), precision-weighted cross-validation was recommended for 6 of the 7 datasets.

Conclusions: Precision-weighted cross-validation should be considered in meta-regression, especially when I² > 50% or weight dispersion is high (CV > 0.6). Unweighted cross-validation can often produce inflated estimates when sampling variances are heterogeneous, which could in turn lead to incorrect conclusions about predictive performance. The choice of evaluation metric may have important implications for model assessment in many meta-analyses.
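
The diagnostic rule stated in the abstract (I² > 50% or weight CV > 0.6) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a DerSimonian-Laird moment estimator on an intercept-only model (the abstract says only "moment-based estimators"), precision weights of the form 1/(v_i + τ²), and a sample standard deviation for the coefficient of variation. The function names `dl_tau2_i2`, `gini`, and `recommend_weighted_cv` are hypothetical. The Gini coefficient is reported for reference only, since the abstract gives no threshold for it.

```python
import numpy as np

def dl_tau2_i2(y, v):
    """DerSimonian-Laird tau^2 and I^2 for an intercept-only model (assumed estimator)."""
    w = 1.0 / v                                   # fixed-effect weights
    mu = np.sum(w * y) / np.sum(w)                # fixed-effect pooled mean
    q = np.sum(w * (y - mu) ** 2)                 # Cochran's Q
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, float((q - df) / c))          # negative estimates truncated to zero
    i2 = max(0.0, float((q - df) / q)) if q > 0 else 0.0
    return tau2, i2

def gini(x):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    idx = np.arange(1, n + 1)
    return float(np.sum((2 * idx - n - 1) * x) / (n * np.sum(x)))

def recommend_weighted_cv(y, v, i2_cut=0.50, cv_cut=0.6):
    """Apply the abstract's rule: weighted CV if I^2 > 50% or weight CV > 0.6."""
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    tau2, i2 = dl_tau2_i2(y, v)
    w = 1.0 / (v + tau2)                          # assumed form of the precision weights
    weight_cv = float(np.std(w, ddof=1) / np.mean(w))
    return {
        "I2": i2,
        "weight_cv": weight_cv,
        "gini": gini(w),
        "use_weighted": bool(i2 > i2_cut or weight_cv > cv_cut),
    }

# Hypothetical data: one very precise "anchor" study among three imprecise ones.
anchor = recommend_weighted_cv([0.0, 0.5, 1.0, 0.2], [0.001, 0.1, 0.1, 0.1])
# Hypothetical data: identical effects and identical sampling variances.
flat = recommend_weighted_cv([0.1, 0.1, 0.1, 0.1], [0.04, 0.04, 0.04, 0.04])
```

In this sketch the anchor example is flagged for weighted evaluation because its I² exceeds 50%, while the flat example is not flagged, since both I² and the weight dispersion are zero. A meta-regression with moderators would use residual heterogeneity rather than the intercept-only quantities computed here.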
