TriORU2-Net++: Attention-Guided Three-Stage U2-Net++ for Light Field Occlusion Removal
Abstract
We introduce TriORU2-Net++, a three-stage architecture for occlusion removal in light-field (LF) images that combines adaptive attention-guided feature integration with progressive hierarchical reconstruction. Unlike existing methods, which struggle to fully exploit spatial hierarchies and to adaptively restore occluded regions across scales, our model employs a ResASPP-AttFPN feature extractor that integrates Residual Atrous Spatial Pyramid Pooling (ResASPP) with a spatial-attention-enhanced Feature Pyramid Network (AttFPN), selectively fusing multiscale features while emphasizing the salient spatial cues essential for occlusion localization. The core of the framework is a tri-stage U2-Net++ reconstruction module that performs progressive restoration through three hierarchically connected encoder-decoder stages of decreasing depth (4-level, 3-level, and 2-level), each built from VGG-based blocks with dense skip connections to recover increasingly refined background content. To further enhance detail preservation and structural consistency, a residual feature refiner (RFR) consolidates residual cues and sharpens object boundaries. Extensive experiments show that our method outperforms state-of-the-art (SOTA) approaches in both quantitative metrics and visual quality.
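
The overall pipeline described above (feature extraction, three progressively shallower encoder-decoder stages, residual refinement) can be summarized in a minimal PyTorch sketch. This sketch is based only on the abstract: all class names (ConvBlock, U2Stage, TriORU2NetPP), channel widths, and the plain convolutional stand-ins for the ResASPP-AttFPN extractor and the RFR are illustrative assumptions, not the authors' implementation, and the dense U2-Net++ skip connections are simplified to single encoder-decoder skips.

# Minimal PyTorch sketch of the TriORU2-Net++ pipeline as described in the abstract.
# All names, depths, and channel widths here are illustrative assumptions.
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """VGG-style block: two 3x3 convolutions with BatchNorm and ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.body(x)

class U2Stage(nn.Module):
    """One encoder-decoder stage with `depth` levels and skip connections
    (dense U2-Net++ skips are simplified here to single skips)."""
    def __init__(self, ch, depth):
        super().__init__()
        self.depth = depth
        self.enc = nn.ModuleList([ConvBlock(ch, ch) for _ in range(depth)])
        self.dec = nn.ModuleList([ConvBlock(2 * ch, ch) for _ in range(depth - 1)])
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
    def forward(self, x):
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < self.depth - 1:          # downsample after all but the deepest level
                skips.append(x)
                x = self.pool(x)
        for block, skip in zip(self.dec, reversed(skips)):
            x = block(torch.cat([self.up(x), skip], dim=1))
        return x

class TriORU2NetPP(nn.Module):
    """Feature extraction -> three progressively shallower U2 stages -> residual refinement."""
    def __init__(self, in_ch=3, ch=32):
        super().__init__()
        self.extractor = ConvBlock(in_ch, ch)   # stand-in for the ResASPP-AttFPN extractor
        self.stages = nn.ModuleList([U2Stage(ch, d) for d in (4, 3, 2)])
        self.refiner = ConvBlock(ch, ch)        # stand-in for the residual feature refiner (RFR)
        self.head = nn.Conv2d(ch, 3, 1)
    def forward(self, x):
        f = self.extractor(x)
        for stage in self.stages:               # progressive three-stage reconstruction
            f = stage(f) + f                    # residual connection between stages
        return self.head(self.refiner(f))

if __name__ == "__main__":
    out = TriORU2NetPP()(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 3, 64, 64])

The main structural idea the sketch captures is the decreasing stage depth (4-level, 3-level, 2-level) with residual connections carried between stages; in the actual method, the attention-guided ResASPP-AttFPN extractor and the RFR would replace the plain convolutional stand-ins.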