Can AI Lend a Hand? Integrating Artificial Intelligence into Education Systematic Reviews and Meta-Analyses
Abstract
Systematic reviews and meta-analyses are resource-intensive, particularly during full-text screening and the coding of study characteristics. Recent advances in artificial intelligence (AI), including large language models, offer opportunities to improve the efficiency and scalability of these phases. This study evaluates the use of ChatGPT-4 to support screening and coding in a large-scale education meta-analysis of remote learning programs. Using 170 full-text articles for screening and 60 studies for coding, we compared AI outputs to human-coded results. In screening, ChatGPT achieved consistently high recall (≈ 0.89) and moderate-to-strong precision (≈ 0.70), suggesting that AI can effectively serve as a first-pass screener when paired with human validation. In coding, ChatGPT reached an overall accuracy of 86%, performing well on structured descriptors such as study design and sample demographics but lagging behind human coders on nuanced judgments such as intervention details and sample characteristics. Based on these findings, we propose practical recommendations for hybrid AI-human workflows: using AI for repetitive, structured tasks; reserving complex interpretive work for human coders; and embedding feedback loops to improve reliability over time. Overall, the findings suggest that AI can meaningfully complement human judgment in education meta-analyses, reducing workload while maintaining rigor.
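To make the reported screening metrics concrete, the following sketch shows how precision and recall are computed from AI-versus-human screening decisions. The confusion counts used here are hypothetical illustrations (not from the study), chosen only so the resulting values land near the reported ≈ 0.70 precision and ≈ 0.89 recall:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from confusion counts.

    tp: articles the AI included that humans also included
    fp: articles the AI included that humans excluded
    fn: articles the AI excluded that humans included
    """
    precision = tp / (tp + fp)  # share of AI inclusions that were correct
    recall = tp / (tp + fn)     # share of true inclusions the AI caught
    return precision, recall

# Hypothetical counts for illustration only:
p, r = precision_recall(tp=62, fp=27, fn=8)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```

High recall with moderate precision is the profile one wants in a first-pass screener: few true includes are missed, at the cost of extra false positives that human validators then filter out.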