Generative Model-Based Fundus Photography Translation for Enhanced Cross-Device Consistency
Abstract
We propose a novel image translation framework that converts fundus images from conventional fundus cameras to confocal scanning laser ophthalmoscopy (cSLO), aiming to bridge a clinically significant domain gap that has been largely overlooked. Our model incorporates self-attention modules to better capture long-range dependencies and jointly optimizes structural similarity and gradient variance losses to enhance anatomical fidelity and fine-detail preservation. To support supervised training, we construct a high-quality paired dataset of fundus camera and cSLO images collected from the same patients, with all pairs manually aligned and clinically verified to ensure diagnostic relevance. Experimental results demonstrate that our method achieves state-of-the-art performance in both perceptual realism and structural accuracy. Additionally, we introduce the Feature Matching Success Rate (FMSR), a novel keypoint-based metric built on AKAZE descriptors, to quantitatively assess anatomical consistency across modalities.
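As a rough illustration of the joint objective mentioned above, the sketch below combines an SSIM term with a gradient variance term in PyTorch. The Sobel-based gradient computation, non-overlapping patch size, loss weights, and single-channel input are all assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # any differentiable SSIM implementation works

# Sobel kernels for single-channel (grayscale) input -- an assumption here.
SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)

def gradient_variance_loss(pred, target, patch=8):
    """Match the per-patch variance of Sobel gradient maps (illustrative)."""
    def patch_var(x):
        gx = F.conv2d(x, SOBEL_X.to(x), padding=1)
        gy = F.conv2d(x, SOBEL_Y.to(x), padding=1)
        # Unfold into non-overlapping patches; take the variance within each.
        return (F.unfold(gx, patch, stride=patch).var(dim=1),
                F.unfold(gy, patch, stride=patch).var(dim=1))
    pvx, pvy = patch_var(pred)
    tvx, tvy = patch_var(target)
    return F.mse_loss(pvx, tvx) + F.mse_loss(pvy, tvy)

def joint_loss(pred, target, w_ssim=1.0, w_gv=1.0):
    # Weights are placeholders; the paper's actual weighting is not stated.
    ssim_term = 1.0 - ssim(pred, target, data_range=1.0)
    return w_ssim * ssim_term + w_gv * gradient_variance_loss(pred, target)
```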
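The abstract does not spell out how FMSR is computed; the following is a minimal sketch of one plausible reading, assuming FMSR is the fraction of AKAZE keypoints in one image that find a ratio-test-passing match in the other. The function name and ratio threshold are hypothetical, using OpenCV's AKAZE detector and brute-force matcher.

```python
import cv2

def feature_matching_success_rate(img_a, img_b, ratio=0.75):
    """Hypothetical FMSR: share of AKAZE keypoints in img_a that find a
    ratio-test-passing match in img_b (exact definition is an assumption)."""
    akaze = cv2.AKAZE_create()
    kp_a, des_a = akaze.detectAndCompute(img_a, None)
    kp_b, des_b = akaze.detectAndCompute(img_b, None)
    if des_a is None or des_b is None or len(kp_a) == 0:
        return 0.0
    # AKAZE descriptors are binary, so Hamming distance is the natural metric.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m in matches
            if len(m) == 2 and m[0].distance < ratio * m[1].distance]
    return len(good) / len(kp_a)

# Usage (grayscale uint8 images; file names are placeholders):
# camera = cv2.imread("camera.png", cv2.IMREAD_GRAYSCALE)
# translated = cv2.imread("translated_cslo.png", cv2.IMREAD_GRAYSCALE)
# print(feature_matching_success_rate(translated, camera))
```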