Grade‑6 reading competence as a predictor of lower‑secondary achievement: A longitudinal single‑school study and an interpretable predictive model
Abstract
Background. Schools routinely assess reading at the end of primary, but evidence on how these scores translate into actionable predictions for lower-secondary achievement is scarce.

Objective. We examine whether end-of-Grade-6 reading comprehension (RC) and reading speed (RS) predict subject grades in lower-secondary education (ESO) and we deliver an interpretable model usable by schools.

Data and setting. Longitudinal records from one school in Catalonia (Spain). Standardized RC (ACL) and RS (Canals) were administered at the end of Grade 6; official ESO grades were collected for multiple subsequent years. The analytic sample comprised N=263 students with complete RC/RS and at least one ESO grade (eight subjects).

Methods. Observational, retrospective study with 80/20 hold-out split by student and 5-fold cross-validation on the training set. We benchmarked multiple linear regression (RC+RS) against regularized regression, random forest, and a multilayer perceptron. Test metrics were MAE, RMSE, R², and the percentage of predictions within ±1 grade point.

Results. RC was a robust, cross-subject predictor; RS added modest, subject-dependent information. On the held-out test set, averages across subjects were MAE ≈ 0.80, RMSE ≈ 0.99, R² ≈ 0.24, with ≈68% of predictions within ±1 point. Descriptively, ≈95% of students who failed at least one ESO subject belonged to low/medium RC bands at Grade 6, whereas none in the very-high RC band failed.

Conclusions. A simple multiple-regression model using standardized Grade-6 RC (and RS) provides transparent, practically useful predictions that can support early identification and targeted support. Broader multi-site validation, addition of contextual covariates, and subgroup analyses are needed to increase explained variance and assess fairness.
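The evaluation protocol named in the Methods (80/20 hold-out split by student, 5-fold cross-validation on the training set, a multiple linear regression on RC and RS, and the four reported metrics) can be illustrated with a minimal sketch. This is not the authors' code. The data are synthetic, and the generating coefficients, the 0 to 10 grade scale, the one-row-per-student layout, and the helper names `fit_ols` and `evaluate` are all assumptions made for illustration. The numbers it prints therefore say nothing about the study's results.

```python
import numpy as np

rng = np.random.default_rng(0)

# SYNTHETIC data (assumption): one row per student, not the study's records.
n_students = 263                        # sample size reported in the abstract
rc = rng.normal(0.0, 1.0, n_students)   # standardized reading comprehension
rs = rng.normal(0.0, 1.0, n_students)   # standardized reading speed
# Hypothetical generating process for a subject grade on an assumed 0-10 scale.
noise = rng.normal(0.0, 1.0, n_students)
grade = np.clip(6.5 + 0.6 * rc + 0.2 * rs + noise, 0.0, 10.0)

X = np.column_stack([np.ones(n_students), rc, rs])  # intercept, RC, RS


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def evaluate(y_true, y_pred):
    """MAE, RMSE, R^2 and the share of predictions within +/-1 grade point."""
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": float(1.0 - ss_res / ss_tot),
        "within_1": float(np.mean(np.abs(err) <= 1.0)),
    }


# 80/20 hold-out split by student (each student is exactly one row here).
order = rng.permutation(n_students)
n_test = int(round(0.2 * n_students))
test_idx, train_idx = order[:n_test], order[n_test:]

# 5-fold cross-validation on the training students only.
folds = np.array_split(train_idx, 5)
cv_mae = []
for k in range(5):
    val_idx = folds[k]
    fit_idx = np.concatenate([folds[j] for j in range(5) if j != k])
    fold_coef = fit_ols(X[fit_idx], grade[fit_idx])
    cv_mae.append(evaluate(grade[val_idx], X[val_idx] @ fold_coef)["mae"])

# Refit on all training students, then score once on the held-out test set.
coef = fit_ols(X[train_idx], grade[train_idx])
test_metrics = evaluate(grade[test_idx], X[test_idx] @ coef)

print("CV MAE per fold:", np.round(cv_mae, 2))
print("Test metrics:", {k: round(v, 2) for k, v in test_metrics.items()})
```

With several grades per student (eight subjects over multiple years, as in the study), the split would need to keep all rows of a student on the same side, for example by permuting student identifiers rather than rows. The sketch sidesteps this by assuming one row per student.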