A Data-Driven Cognitive Feature–Based Model for English Text Readability Assessment to Support College English Instruction

Abstract

Text readability assessment is essential for effective college English teaching and instructional material selection. Traditional readability models rely mainly on surface linguistic features and often fail to reflect the cognitive processes involved in reading comprehension. To address this limitation, this study proposes a cognitive feature–based approach to English text readability assessment, aiming to support data-driven English teaching evaluation. The proposed method incorporates cognitive features related to lexical rarity, logical complexity, and comprehension difficulty, and integrates them with conventional linguistic features. Multiple machine learning models are employed and evaluated on four benchmark datasets: CEFR, CLEC, OneStopEnglish, and RACE. Experimental results show that models combining cognitive and linguistic features consistently outperform those using linguistic features alone across multiple evaluation metrics. The findings indicate that cognitive features provide complementary information for readability assessment and enhance the discriminability of readability levels. This study offers practical implications for college English teaching by enabling more accurate matching between reading materials and learners' proficiency levels, thereby supporting personalized and data-driven instructional decision-making.
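The abstract's core idea of concatenating cognitive proxies (e.g., lexical rarity) with conventional linguistic features into a single feature vector can be sketched as follows. This is a minimal illustration, not the authors' implementation: the high-frequency word list, connective set, and feature definitions are assumptions standing in for the corpus resources a real system would use.

```python
import re

# Hypothetical high-frequency word list standing in for a corpus frequency table.
COMMON_WORDS = {"the", "a", "is", "of", "and", "to", "in", "it", "that", "was"}

# Illustrative connectives used as a crude proxy for logical complexity.
CONNECTIVES = {"because", "although", "however", "therefore", "unless"}

def linguistic_features(text):
    """Conventional surface features: sentence and word length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
    }

def cognitive_features(text):
    """Cognitive proxies: lexical rarity and connective density."""
    words = re.findall(r"[a-z']+", text.lower())
    rare = sum(w not in COMMON_WORDS for w in words)
    linkers = sum(w in CONNECTIVES for w in words)
    return {
        "lexical_rarity": rare / max(len(words), 1),
        "connective_density": linkers / max(len(words), 1),
    }

def feature_vector(text):
    """Concatenate both feature groups into one vector (sorted key order)."""
    feats = {**linguistic_features(text), **cognitive_features(text)}
    return [feats[k] for k in sorted(feats)]

simple = "The cat is in the hat. It was a hat."
hard = ("Although quantification remains elusive, researchers persevere "
        "because empirical validation demands rigor.")
print(feature_vector(simple))
print(feature_vector(hard))
```

A vector like this would then be passed to any standard classifier or regressor trained on level-annotated texts (such as the CEFR or RACE data the abstract mentions) to predict a readability level.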
