TriORU2-Net++: Attention-Guided Three-Stage U2-Net++ for Light Field Occlusion Removal

Abstract

We introduce TriORU2-Net++, a novel three-stage architecture that addresses the persistent challenge of occlusion removal in light-field (LF) images through adaptive attention-guided feature integration and progressive hierarchical reconstruction. Existing methods struggle to fully exploit spatial hierarchies and to adaptively restore occluded regions across scales. In contrast, our model incorporates a ResASPP-AttFPN feature extractor, which integrates Residual Atrous Spatial Pyramid Pooling (ResASPP) with a spatial attention-enhanced Feature Pyramid Network (AttFPN) to selectively fuse multiscale features while emphasizing the salient spatial cues essential for occlusion localization. The core of our framework is a three-stage U2-Net++ reconstruction module, which performs progressive restoration through three hierarchically connected encoder-decoder stages of decreasing depth (4-level, 3-level, and 2-level), each built on VGG-based blocks and dense skip connections, to recover increasingly refined background content. To further enhance detail preservation and structural consistency, we introduce a residual feature refiner (RFR) that consolidates residual cues and sharpens object boundaries. Extensive experiments show that our method outperforms state-of-the-art (SOTA) approaches in both quantitative metrics and visual quality.
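The abstract's core idea of progressive restoration through three encoder-decoder stages of decreasing depth (4-, 3-, and 2-level), each built from VGG-style blocks with skip connections, can be sketched as below. This is a minimal illustrative PyTorch sketch of that structural idea only; all class and function names (`UNetStage`, `TriStageReconstructor`, `vgg_block`) are hypothetical and do not correspond to the authors' implementation, which additionally includes the ResASPP-AttFPN extractor, dense skip connections, and the RFR module.

```python
# Hypothetical sketch: three encoder-decoder stages of decreasing depth
# (4, 3, 2 levels), each refining the previous stage's output.
# Names are illustrative, not the authors' code.
import torch
import torch.nn as nn

def vgg_block(in_ch, out_ch):
    # VGG-style block: two 3x3 convs, each followed by BN + ReLU
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class UNetStage(nn.Module):
    """Symmetric encoder-decoder of a given depth with skip connections."""
    def __init__(self, depth, in_ch=3, base=16):
        super().__init__()
        chs = [base * 2 ** i for i in range(depth)]
        self.encs = nn.ModuleList()
        prev = in_ch
        for c in chs:
            self.encs.append(vgg_block(prev, c))
            prev = c
        self.pool = nn.MaxPool2d(2)
        self.ups = nn.ModuleList()
        self.decs = nn.ModuleList()
        for i in range(depth - 1, 0, -1):
            self.ups.append(nn.ConvTranspose2d(chs[i], chs[i - 1], 2, stride=2))
            self.decs.append(vgg_block(2 * chs[i - 1], chs[i - 1]))
        self.out = nn.Conv2d(chs[0], 3, 1)  # project back to RGB

    def forward(self, x):
        skips = []
        for i, enc in enumerate(self.encs):
            x = enc(x)
            if i < len(self.encs) - 1:
                skips.append(x)      # keep feature map for the skip connection
                x = self.pool(x)     # downsample before the next encoder level
        for up, dec, skip in zip(self.ups, self.decs, reversed(skips)):
            x = dec(torch.cat([up(x), skip], dim=1))  # upsample + fuse skip
        return self.out(x)

class TriStageReconstructor(nn.Module):
    """Three stages of decreasing depth; each refines the previous output."""
    def __init__(self):
        super().__init__()
        self.stages = nn.ModuleList([UNetStage(d) for d in (4, 3, 2)])

    def forward(self, x):
        for stage in self.stages:
            x = x + stage(x)  # residual refinement between stages
        return x

img = torch.randn(1, 3, 64, 64)  # occluded input: (batch, channels, H, W)
out = TriStageReconstructor()(img)
print(tuple(out.shape))  # (1, 3, 64, 64): same resolution as the input
```

Each stage here refines the previous stage's reconstruction residually, with the shallower later stages acting as lighter-weight refiners; the actual depths (4/3/2) follow the abstract, while all channel widths and the residual connection between stages are illustrative choices.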
