Neural Steered Mixture of Experts for Medical Image Denoising and Super-Resolution
Abstract
In medical imaging (MI) analysis, achieving high spatial resolution remains challenging because long acquisition times and limited frame rates constrain the quality and diagnostic value of clinical data. Super-resolution (SR) methods reconstruct high-resolution (HR) images from low-resolution (LR) input, mitigating hardware and temporal constraints while enhancing interpretability and diagnostic reliability. Current single-image super-resolution (SISR) methods often rely on oversimplified noise assumptions, typically additive white Gaussian noise (AWGN), which fail to capture the complex, modality-dependent noise distributions inherent to clinical imaging. These limitations call for more sophisticated SR frameworks that accurately represent non-stationary degradations and perform robustly across imaging modalities. We introduce a neural parametric Steered Mixture of Experts (N-SMoE) framework that leverages generative adversarial training and a Stochastic Degradation Model (SDM), which applies diverse perturbations to downsampled inputs to approximate clinical conditions. The framework combines a novel encoder network with an implicit probabilistic SMoE decoder. The encoder uses a Laplacian resizer with bandpass filtering to capture local spatial information and employs multi-head attention to preserve high-frequency (HF) structural patterns while estimating latent representations of the input image. The implicit probabilistic gating mechanisms of the SMoE decoder, built on two-dimensional edge-aware kernels, represent the signal of interest with continuous transitions, making this autoregressive approach more robust and effective for SR and denoising.
The proposed N-SMoE framework not only provides interpretability for the learned representations but also demonstrates state-of-the-art (SOTA) performance in restoration tasks across multiple medical imaging datasets, achieving improved fidelity and perceptual metrics.
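To make the gating idea concrete, the following is a minimal NumPy sketch of a classical SMoE-style reconstruction, in which softmax-normalized, steered Gaussian kernels blend constant experts over a continuous pixel grid. It illustrates the general technique only and is not the N-SMoE decoder described above: the function names, the constant-expert choice, and all parameter values are assumptions made for illustration, and in the proposed framework such parameters would presumably be predicted by the encoder rather than set by hand.

```python
import numpy as np


def steering_matrix(angle, scale_u, scale_v):
    """Hypothetical helper: build A so that the inverse covariance is A^T A."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, s], [-s, c]])  # rotate coordinates into the kernel frame
    return np.diag([1.0 / scale_u, 1.0 / scale_v]) @ rotation


def smoe_reconstruct(centers, steering, expert_values, height, width):
    """Render an image from K steered Gaussian kernels with softmax gating.

    centers:       (K, 2) kernel centers as (x, y) in [0, 1]^2
    steering:      (K, 2, 2) matrices A_k defining each kernel's shape
    expert_values: (K,) constant expert outputs (gray levels)
    """
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, height), np.linspace(0.0, 1.0, width), indexing="ij"
    )
    grid = np.stack([xs, ys], axis=-1).reshape(-1, 2)      # (P, 2) pixel positions
    diff = grid[:, None, :] - centers[None, :, :]          # (P, K, 2)
    proj = np.einsum("kij,pkj->pki", steering, diff)       # A_k @ diff, shape (P, K, 2)
    logits = -0.5 * np.sum(proj**2, axis=-1)               # negative Mahalanobis / 2
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    gates = np.exp(logits)
    gates /= gates.sum(axis=1, keepdims=True)              # soft gating, sums to 1
    image = gates @ expert_values                          # gate-weighted expert blend
    return image.reshape(height, width), gates.reshape(height, width, -1)


# Illustrative values only: two kernels, a dark expert on the left, a bright one on the right.
centers = np.array([[0.25, 0.5], [0.75, 0.5]])
steering = np.stack([steering_matrix(0.0, 0.1, 0.1), steering_matrix(0.0, 0.1, 0.1)])
expert_values = np.array([0.0, 1.0])
image, gates = smoe_reconstruct(centers, steering, expert_values, 33, 33)
```

In this sketch the gates decide, per pixel, how much each expert contributes, so the transition between the two regions is smooth rather than blocky. Changing `angle`, `scale_u`, and `scale_v` elongates and rotates a kernel, which is how a steered kernel can align with an edge.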