3DeepVOG: An Open-Source Framework for Real-Time, Accurate 3D Gaze Tracking with Deep Learning

Abstract

Objective

Eye movements are key biomarkers for diagnosing and monitoring neuro-otological, neuro-ophthalmological and neurodegenerative disorders. Video-oculography (VOG) systems enable detection of small, rapid eye movements and subtle oculomotor pathologies that may be missed during clinical exams. However, these systems depend on high-quality input for accurate tracking, struggle with torsional movements, and are often too costly for broader use in clinical and research settings.

Methods

To overcome these limitations, we developed 3DeepVOG, a deep learning-based framework for three-dimensional monocular gaze tracking (horizontal, vertical, and torsional rotation) designed to operate robustly across varied imaging conditions, including low-light and noisy environments. The method includes automated frame-by-frame segmentation of the pupil and iris, followed by geometrically interpretable gaze estimation based on a two-sphere anatomical eyeball model incorporating corneal refraction correction. Torsion is tracked in real time using a novel mini-patch template matching approach. The system was trained on over 24,000 annotated samples obtained across multiple devices and clinical scenarios. Its performance was tested against a gold-standard VOG system in healthy controls.
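
To illustrate the general idea behind template-based torsion tracking, the following is a minimal sketch in Python. It is not the mini-patch method described above, whose details are not given in this abstract. Instead it uses a simpler, well-known stand-in: circular cross-correlation of a one-dimensional iris intensity profile against a reference profile. The function name `estimate_torsion_deg`, the assumption that the iris has already been sampled at equal angular steps around the pupil center, and the synthetic texture are all hypothetical.

```python
import math
import random


def estimate_torsion_deg(reference, current):
    """Return the rotation in degrees that best aligns `current` with `reference`.

    Both inputs are iris intensity profiles sampled at equal angular steps
    around the pupil center (an assumed preprocessing step, not taken from
    the abstract). A positive result means `current` is `reference` shifted
    toward higher sample indices.
    """
    n = len(reference)
    if n == 0 or len(current) != n:
        raise ValueError("profiles must be non-empty and of equal length")

    # Remove the mean so that overall brightness does not dominate the score.
    mean_ref = sum(reference) / n
    mean_cur = sum(current) / n
    ref = [v - mean_ref for v in reference]
    cur = [v - mean_cur for v in current]

    best_shift, best_score = 0, -math.inf
    for shift in range(n):
        # Circular cross-correlation at this candidate shift.
        score = sum(ref[i] * cur[(i + shift) % n] for i in range(n))
        if score > best_score:
            best_shift, best_score = shift, score

    # Map the shift to a signed angle in (-180, 180] degrees.
    if best_shift > n // 2:
        best_shift -= n
    return best_shift * 360.0 / n


# Synthetic example: a random "iris texture" at 1 degree resolution,
# rotated by 5 samples (hypothetical data, for illustration only).
rng = random.Random(0)
reference_profile = [rng.random() for _ in range(360)]
rotated_profile = [reference_profile[(j - 5) % 360] for j in range(360)]
print(estimate_torsion_deg(reference_profile, rotated_profile))
```

An exhaustive search over all shifts is quadratic in the number of samples. A real-time system would likely restrict the search range or match small local patches, which is presumably part of the motivation for a mini-patch approach.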

Results

3DeepVOG operates in real time (>300 fps) and achieves mean gaze errors of approximately 0.1° in all three motion dimensions. Derived oculomotor metrics – such as saccadic peak velocity, smooth pursuit gain, and optokinetic nystagmus slow-phase velocity – show good-to-excellent agreement with results from a clinical gold-standard system.
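
As a rough illustration of how two of the oculomotor metrics named above are conventionally defined, the sketch below computes saccadic peak velocity and smooth pursuit gain from a gaze position trace. It uses textbook definitions and is not the authors' analysis pipeline. The function names, the sample rate, and the synthetic traces are assumptions, and steps a clinical pipeline would normally need (filtering, saccade detection, removal of catch-up saccades from pursuit segments) are omitted.

```python
import math


def velocity_deg_per_s(position_deg, sample_rate_hz):
    """Central-difference velocity; the endpoints use one-sided differences."""
    n = len(position_deg)
    if n < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / sample_rate_hz
    velocity = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        velocity.append((position_deg[hi] - position_deg[lo]) / ((hi - lo) * dt))
    return velocity


def saccadic_peak_velocity(position_deg, sample_rate_hz):
    """Largest absolute velocity within a trace that contains one saccade."""
    return max(abs(v) for v in velocity_deg_per_s(position_deg, sample_rate_hz))


def pursuit_gain(eye_deg, target_deg, sample_rate_hz):
    """Least-squares slope (through the origin) of eye velocity on target velocity."""
    eye_vel = velocity_deg_per_s(eye_deg, sample_rate_hz)
    target_vel = velocity_deg_per_s(target_deg, sample_rate_hz)
    denom = sum(t * t for t in target_vel)
    if denom == 0:
        raise ValueError("target is stationary; gain is undefined")
    return sum(e * t for e, t in zip(eye_vel, target_vel)) / denom


# Synthetic example (hypothetical data): a 0.2 Hz sinusoidal target sampled
# at 100 Hz, with the eye following at 90 % of the target amplitude.
times = [i / 100 for i in range(500)]
target = [10 * math.sin(2 * math.pi * 0.2 * t) for t in times]
eye = [0.9 * v for v in target]
print(pursuit_gain(eye, target, 100.0))
```

Fitting the gain as a regression slope rather than a ratio of peak velocities makes the estimate less sensitive to single noisy samples, although either definition appears in practice.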

Conclusions

3DeepVOG enables accurate, quantitative eye movement tracking across three dimensions under diverse conditions. As an open-source framework, it provides an accessible and scalable tool for advancing research and clinical assessment in neurological oculomotor disorders.
