OrthoKnow-SP: A Large-Scale Dataset on Orthographic Knowledge and Spelling Decisions in Spanish Adults

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Orthographic knowledge is a critical component of skilled language use, yet its large-scale behavioral signatures remain understudied in Spanish. To address this gap, we developed OrthoKnow-SP, a megastudy that captures spelling decisions from 27,185 native Spanish-speaking adults who completed an 80-item forced-choice task. Each trial required selecting the correctly spelled word from a pair comprising a real word and a pseudohomophone foil that preserved pronunciation while violating the correct graphemic representation. The stimuli targeted six high-confusability contrasts in Spanish orthography. We recorded response accuracy and reaction times for over 2.17 million trials, alongside demographic and device metadata. Results show robust variability across items and individuals, with item-level metrics closely aligned with independent norms of word prevalence. A composite difficulty index integrating speed and accuracy further allowed fine-grained item ranking. The dataset provides the first population-scale norms of Spanish spelling difficulty, capturing regional and generational diversity absent from traditional lab-based studies. Public release of OrthoKnow-SP enables new research on the cognitive and demographic factors shaping orthographic decisions, and provides educators, clinicians, and developers with a valuable benchmark for assessing spelling competence and modeling written language processing.

Article activity feed