VI-OCR: "Visually Impaired" Optical Character Recognition Pipeline for Text Accessibility Assessment

Abstract

Low vision adversely impacts daily activities, particularly reading. However, quantifying text accessibility for different levels of low vision is challenging, leading to product designs that often overlook the vision status of low vision readers. In this paper, we bridge the gap between the computer vision and low vision fields by introducing a text accessibility assessment pipeline called VI-OCR (Visually Impaired Optical Character Recognition), based on state-of-the-art OCR models. VI-OCR mimics human text recognition ability under specified levels of visual acuity and contrast sensitivity loss to estimate whether text of a given size would be recognizable to a low vision reader. We benchmarked specialized OCR models and vision-language models on replicating human text recognition performance under visual acuity and contrast sensitivity deficits across three reading tasks: letter acuity using ETDRS charts, word acuity using MNREAD charts, and scene text recognition using complex real-life images. Comparing model performance to that of normal vision participants viewing degraded text revealed major issues in some models, including limited generalizability across reading tasks, difficulty handling severe contrast reduction, and outperforming rather than mimicking human observers. However, the robust, human-like performance of the best models, such as Qwen and Gemini, supports the feasibility of VI-OCR for assessing text accessibility.
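
To make the pipeline concrete, the sketch below illustrates one plausible way to simulate a specified level of visual acuity and contrast sensitivity loss on a text image before passing it to an OCR or vision-language backend. The function names, the blur-based acuity model, and the linear contrast compression are illustrative assumptions for this example, not the authors' exact implementation.

```python
from PIL import Image, ImageFilter, ImageEnhance


def degrade_for_low_vision(img: Image.Image,
                           logmar_acuity: float,
                           contrast_loss: float,
                           pixels_per_degree: float = 60.0) -> Image.Image:
    """Return a copy of `img` degraded to approximate a given acuity and
    contrast sensitivity loss.

    Assumptions (illustrative only):
    - Acuity loss is approximated by Gaussian blur whose radius scales with
      the minimum angle of resolution (MAR = 10**logMAR arcminutes).
    - Contrast sensitivity loss is approximated by compressing pixel values
      toward mid-gray by a factor of (1 - contrast_loss).
    """
    # Convert MAR (arcminutes) to degrees, then to a blur radius in pixels.
    mar_arcmin = 10.0 ** logmar_acuity
    sigma_px = (mar_arcmin / 60.0) * pixels_per_degree
    blurred = img.filter(ImageFilter.GaussianBlur(radius=sigma_px))

    # Factor < 1 reduces contrast; 0 collapses the image to uniform gray.
    factor = max(0.0, 1.0 - contrast_loss)
    return ImageEnhance.Contrast(blurred).enhance(factor)


def text_is_accessible(img: Image.Image, expected_text: str,
                       run_ocr, **vision_params) -> bool:
    """Judge accessibility by whether an OCR/VLM backend (`run_ocr`, a
    user-supplied callable mapping an image to a string) still recovers
    the expected text after simulated degradation."""
    degraded = degrade_for_low_vision(img, **vision_params)
    return run_ocr(degraded).strip().lower() == expected_text.strip().lower()
```

In this sketch, a word on an MNREAD-style chart would be judged accessible at a given print size if the backend model still reads it correctly after degradation, mirroring how the paper compares model outputs against human performance on degraded text.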
