The Comparability of Manual vs. Algorithm-Based Calculation of Clinical Trial Methodological Quality Indices
Abstract
Background
Evidence hierarchies guide evidence-based practice by ranking forms of evidence to support translation and clinical decision-making. Systematic reviews and meta-analyses (SRMAs) represent the highest form of evidence but are time- and resource-intensive, contributing to the estimated 17-year lag in the translation of evidence into practice. Tools that automate aspects of the systematic review process aim to shorten this lag. Specifically, algorithm-based evaluation of study quality, as performed by the CogTale evidence synthesis platform, accelerates such processes relative to manual methods, leading to more rapid synthesis of the evidence. In this study, we assessed the agreement between CogTale's algorithm-based scoring of the PEDro and Risk of Bias (RoB) scales and manual scoring of these scales.
Methods
We selected 37 randomised controlled trials (RCTs) with PEDro scores available on the NeuroBITE Platform and 37 trials with RoB scores available in Cochrane meta-analyses. Agreement for individual PEDro and RoB items was evaluated using Gwet's AC1, while a Bland-Altman plot assessed agreement on total PEDro scores.
Results
The Bland-Altman analysis showed an average difference in PEDro scores of 0.92 between CogTale and NeuroBITE, with limits of agreement from −2.09 to 3.93. Gwet's AC1 revealed almost perfect agreement for PEDro items P1, P2, and P11; substantial agreement for P5, P6, P7, and P10; moderate agreement for P3 and P4; slight agreement for P8; and poor agreement for P9. For RoB domains, substantial agreement was found for random sequence generation, allocation concealment, and detection bias, with fair agreement in the other domains.
Conclusions
Overall, CogTale's algorithm-based PEDro and RoB scores align well with manual scores, despite some discrepancies for specific PEDro items (P3, P8, P9) and RoB domains, likely due to systematic differences in scoring criteria. CogTale shows promise for automating quality assessments, potentially reducing the time required for evidence synthesis while maintaining accuracy. Future research should address key limitations by examining how scoring differences affect meta-analytic outcomes and by evaluating CogTale's performance on larger datasets as more evidence accumulates on the platform.
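For readers unfamiliar with the agreement statistics reported above, the sketch below illustrates how Gwet's AC1 for a single binary item and Bland-Altman limits of agreement for total scores can be computed. It is a minimal illustration in Python under simplifying assumptions (two raters, dichotomous item ratings, hypothetical example data); it is not the code used by CogTale or in this study.

```python
import numpy as np

def gwets_ac1(rater_a, rater_b):
    """Gwet's AC1 chance-corrected agreement for two raters rating a binary (0/1) item.

    pa: observed proportion of agreement.
    pe: chance agreement based on the average prevalence of '1' ratings across raters.
    """
    a = np.asarray(rater_a, dtype=float)
    b = np.asarray(rater_b, dtype=float)
    pa = np.mean(a == b)                  # observed agreement
    pi1 = (a.mean() + b.mean()) / 2.0     # mean proportion of '1' ratings across the two raters
    pe = 2.0 * pi1 * (1.0 - pi1)          # chance agreement for two categories
    return (pa - pe) / (1.0 - pe)

def bland_altman(scores_a, scores_b):
    """Mean difference and 95% limits of agreement between two sets of total scores."""
    diff = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff

# Hypothetical data: one PEDro item scored by the algorithm and manually for eight trials.
algorithm_item = [1, 1, 0, 1, 1, 0, 1, 1]
manual_item    = [1, 1, 0, 1, 0, 0, 1, 1]
print("Gwet's AC1:", round(gwets_ac1(algorithm_item, manual_item), 3))

# Hypothetical total PEDro scores for six trials.
algorithm_total = [7, 6, 8, 5, 7, 6]
manual_total    = [6, 6, 7, 5, 6, 6]
md, lo, hi = bland_altman(algorithm_total, manual_total)
print(f"Mean difference: {md:.2f}, 95% LoA: [{lo:.2f}, {hi:.2f}]")
```

Unlike Cohen's kappa, AC1 estimates chance agreement from the overall prevalence of ratings, which makes it less sensitive to highly skewed item distributions, a common situation for quality-assessment items that most trials either satisfy or fail.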