Metascientific Evaluation of Meta-Analyses in Thyroid Ultrasound: Evidence Quality, Guideline Integration, and Clinical Impact


Abstract

Introduction: Thyroid ultrasound is the cornerstone diagnostic modality for risk stratification of thyroid nodules and for guiding indications for fine-needle aspiration biopsy (FNAB). Despite widespread adoption of Thyroid Imaging Reporting and Data Systems (TI-RADS), methodological inconsistencies and substantial variability persist across meta-analyses, undermining the comparability and robustness of the evidence. Objective: To conduct a metascientific evaluation of meta-analyses on thyroid ultrasound, assessing inter-study agreement, methodological heterogeneity, and the clinical implications of distinct classification systems. Methods: We identified 18 meta-analyses published between 2019 and 2025, encompassing 287 primary studies. Diagnostic performance metrics (sensitivity, specificity, and likelihood ratios), FNAB rates, and statistical heterogeneity patterns were systematically evaluated. Concordance was specifically examined across the major frameworks: ACR TI-RADS, EU-TIRADS, ATA guidelines, K-TIRADS, and emerging models. Results: Diagnostic accuracy varied considerably across meta-analyses. Weighted mean sensitivity across meta-analyses was 76.8% (95% CI: 75.4–78.2%), with specificity varying by system. ACR TI-RADS demonstrated higher specificity, with unnecessary FNAB rates of approximately 23%, whereas EU-TIRADS showed lower unnecessary FNAB rates (approximately 17%) but variable specificity across studies. Statistical heterogeneity was pronounced (I² up to 78.3%), reflecting divergent methodological approaches, population characteristics, and analytical thresholds. Conclusion: The findings reveal structural limitations in the current body of meta-analytic evidence and underscore the urgent need for methodological standardization and for integration of multimodal strategies. Selection of a risk stratification system should balance diagnostic sensitivity against clinical impact, particularly the minimization of unnecessary invasive procedures.
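For readers less familiar with the quantities reported above, the following is a minimal Python sketch of how a weighted mean, its 95% confidence interval, a heterogeneity statistic, and likelihood ratios are commonly computed. It assumes fixed-effect inverse-variance weighting and the Higgins I² derived from Cochran's Q; the abstract does not state which pooling method the authors used, so this is an illustration and not their procedure. The function names and any input values are hypothetical.

```python
import math


def pooled_estimate_and_i2(estimates, variances):
    """Inverse-variance weighted mean with Cochran's Q and I^2 (in percent).

    Assumption: fixed-effect weights w_i = 1 / variance_i. In practice,
    proportions such as sensitivity are often pooled on a logit scale
    with a random-effects model; that step is omitted here.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Normal-approximation 95% confidence interval for the pooled estimate.
    se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return pooled, ci, q, i2


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative
```

Both functions take proportions on a 0 to 1 scale. An I² near the upper end of its range, as reported in the abstract, indicates that most of the observed variation between studies is not explained by sampling error alone.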
