A Hierarchical Deep Learning Architecture for Diagnosing Retinal Diseases Using Cross-Modal OCT to Fundus Translation in the Lack of Paired Data
Abstract
This paper addresses the automated diagnosis of retinal diseases, particularly age-related macular degeneration (AMD) and diabetic retinopathy (DR), from optical coherence tomography (OCT), and tackles three key challenges: disease comorbidity, severe class imbalance, and the lack of strictly paired OCT and fundus data. We propose a hierarchical, modular deep learning system that performs multi-label OCT screening and conditionally routes cases to specialized staging modules. To enable DR staging when fundus images are unavailable, we align OCT and fundus representations across modalities by training a latent bridge that projects OCT embeddings into the fundus feature space. We improve clinical reliability through per-class threshold calibration and quality control checks for OCT-only DR staging. Experiments show strong multi-label performance (macro-F1 = 0.989 ± 0.006 after per-class threshold calibration) and low expected calibration error (ECE = 2.1 ± 0.4%); OCT-only DR staging is feasible in 96.1% of cases that meet the quality control criterion.
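The abstract does not say how the per-class thresholds are chosen. As an illustration only, the sketch below assumes a simple grid search that picks, for each label independently, the threshold maximizing that label's F1 on a held-out validation set. The function names (`f1_binary`, `calibrate_thresholds`, `macro_f1`) and the threshold grid are hypothetical and not taken from the paper.

```python
import numpy as np

def f1_binary(y_true, y_pred):
    """F1 score for one binary label; returns 0.0 when it is undefined."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

def calibrate_thresholds(y_true, y_prob, grid=None):
    """Pick one decision threshold per class by maximizing per-class F1.

    y_true: (n_samples, n_classes) array of 0/1 labels (multi-label).
    y_prob: (n_samples, n_classes) array of predicted probabilities.
    grid:   candidate thresholds (assumed here; not specified by the paper).
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    n_classes = y_true.shape[1]
    thresholds = np.full(n_classes, 0.5)
    for c in range(n_classes):
        scores = [
            f1_binary(y_true[:, c], (y_prob[:, c] >= t).astype(int))
            for t in grid
        ]
        # argmax returns the first maximum, i.e. the lowest best threshold
        thresholds[c] = grid[int(np.argmax(scores))]
    return thresholds

def macro_f1(y_true, y_prob, thresholds):
    """Unweighted mean of per-class F1 after applying per-class thresholds."""
    y_pred = (y_prob >= thresholds).astype(int)
    per_class = [
        f1_binary(y_true[:, c], y_pred[:, c]) for c in range(y_true.shape[1])
    ]
    return float(np.mean(per_class))
```

In practice the thresholds would be fitted on a validation split and then frozen before evaluating on the test split. Fitting and scoring on the same data overstates macro-F1.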