Comparative Evaluation of U-Net-Based Conditioned Diffusion Model and Cycle-GAN for Unpaired CT-MRI Brain Image Synthesis with XAI Validation
Abstract
Imaging pipelines in healthcare are often limited by reliance on a single modality. Patients with metal implants or pacemakers cannot undergo MRI, emergency stroke diagnostics typically depend solely on rapid CT scans, and pediatric cases favor lower radiation exposure, which rules out multimodal imaging. In these situations, cross-modal image synthesis has become an appealing approach for generating one modality from another, particularly in brain imaging (e.g., converting CT to MRI and vice versa), where pairwise alignment is challenging. This study evaluated two advanced models for unpaired CT-MRI brain image synthesis: a conditioned diffusion model and a Cycle-GAN, both built on the same U-Net architecture. Different training approaches were used, iterative denoising for the diffusion model and adversarial training for the Cycle-GAN, to compare their effectiveness. Both models were trained for 2000 epochs and evaluated using task-specific metrics, including Fréchet Inception Distance (FID), Inception Score (IS), LPIPS, and the Dice index for tissue segmentation. The conditioned diffusion model consistently outperformed the adversarial model across all performance metrics, reducing FID by 39.5%, increasing IS by 19.0%, and improving anatomical fidelity. Explainability (XAI) analyses revealed an over 18-fold increase in attention to relevant anatomical regions and a 48% reduction in attention to less important areas. Radiologists confirmed that the diffusion model produced more realistic images, supported greater diagnostic confidence, and achieved higher Turing test scores. Although computationally more intensive, the diffusion model demonstrated stronger alignment with actual anatomical features and medical standards.
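The Dice index mentioned above measures overlap between a segmentation of the synthesized image and a reference segmentation. As a minimal sketch (the function name and toy masks below are illustrative, not taken from the paper), the standard Dice similarity coefficient for binary masks can be computed as:

```python
import numpy as np

def dice_index(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice similarity coefficient between two binary segmentation masks.

    Dice = 2 * |A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement by convention
    return 2.0 * intersection / denom

# Toy example: two 2x3 masks overlapping in 2 of 3 foreground pixels each
a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_index(a, b))  # 2*2 / (3+3) ≈ 0.667
```

In practice, metrics like FID and IS are computed over feature distributions of many images (e.g., via torchmetrics or similar libraries), whereas Dice is computed per segmentation pair and then averaged.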