BanglaOCT2025: A Population-Specific Fovea-Centric OCT Dataset with Self-Supervised Volumetric Restoration Using Flip-Flop Swin Transformers
Abstract
Background: Age-related macular degeneration (AMD) is a major cause of vision loss, yet publicly available OCT datasets lack demographic diversity, particularly from South Asian populations. Existing datasets largely represent Western cohorts, limiting AI generalizability. Moreover, raw OCT volumes contain redundant spatial information and speckle noise, hindering efficient analysis.

Methods: We introduce BanglaOCT2025, a retrospective dataset collected from the National Institute of Ophthalmology and Hospital (NIOH), Bangladesh, using Nidek RS-330 Duo 2 and RS-3000 Advance systems. We propose a novel preprocessing pipeline comprising two stages: (1) a "Constraint-Based Centroid Minimization" algorithm that autonomously identifies the foveal center to extract a 33-slice region of interest, robust against retinal tilt; and (2) a Self-Supervised Volumetric Restoration module that uses a Flip-Flop Swin Transformer (FFSwin) backbone to mitigate speckle noise without requiring paired clean data.

Results: The dataset includes 1,585 OCT scans (202,880 B-scans), with 857 expert-annotated cases (54 DryAMD, 61 WetAMD, and 742 NonAMD). The proposed restoration pipeline enhances image clarity while preserving pathological biomarkers, improving classification accuracy from 69.08% to 99.88% using the same classifier. Statistical significance was confirmed via paired McNemar testing, and denoising quality was evaluated using reference-free volumetric metrics.

Conclusions: BanglaOCT2025 constitutes the first clinically validated OCT dataset representing the Bengali population and establishes a reproducible, fovea-centric preprocessing and restoration paradigm for robust AMD analysis in resource-constrained clinical settings.
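
The paired McNemar test mentioned in the Results compares two classifiers on the same cases by looking only at the cases where they disagree. The sketch below is a minimal illustration of that idea, not the authors' code: it assumes an exact (binomial) two-sided variant of the test, and the function names `discordant_counts` and `mcnemar_exact`, as well as the toy labels in the usage example, are hypothetical.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count cases where exactly one of two classifiers is correct.

    Returns (b, c): b = A correct and B wrong, c = A wrong and B correct.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    Under the null hypothesis each discordant case is equally likely to
    favour either classifier, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # the classifiers never disagree, so there is no evidence
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Made-up toy labels for illustration only (not data from the paper).
    y_true = ["NonAMD", "DryAMD", "WetAMD", "NonAMD", "WetAMD", "DryAMD"]
    before = ["NonAMD", "NonAMD", "NonAMD", "NonAMD", "DryAMD", "NonAMD"]
    after = ["NonAMD", "DryAMD", "WetAMD", "NonAMD", "WetAMD", "DryAMD"]
    b, c = discordant_counts(y_true, before, after)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant cases enter the statistic, which is why the test suits a before/after comparison that uses the same classifier on the same annotated cases. With few discordant cases the exact form is usually preferred over the chi-squared approximation; which variant the authors used is not stated in the abstract.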