PRDE: Progressive Representation Reconstruction for Single Image Depth Estimation with Diffusion Priors and Detail-Amplification Decoding

Abstract

Single-image depth estimation (SIDE) is challenging, particularly in complex scenes with fine-grained structures, occlusions, and non-uniform textures. Although diffusion-based methods model global semantic structure effectively, their denoising process often suppresses high-frequency components, which leads to lost local detail and degraded edge quality. This paper proposes a progressive single-image depth estimation framework, termed PRDE (Progressive Representation Reconstruction with Diffusion Priors and Detail Enhancement), which integrates diffusion-generated global features with a dedicated Detail Feature Refinement Module (DFRM). The DFRM uses frequency-domain attention and a representation alignment and integration module to strengthen structural integrity and recover local details. Experiments on two standard benchmarks, NYU Depth v2 and KITTI, show that the proposed model outperforms existing models across multiple metrics. Notably, on NYU Depth v2, it improves Log10, SqRel, and RMSELog by 3.96%, 8.07%, and 2.33%, respectively, and attains the best δ3 accuracy among all compared methods.
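For readers unfamiliar with the metrics named above, the following is a minimal sketch of how Log10, SqRel, RMSELog, and δ3 are commonly defined in the depth-estimation literature. It is not taken from the paper: the function name `depth_metrics` and the flat-list input format are assumptions made for illustration, and the abstract does not describe the authors' evaluation protocol (valid-pixel masking, depth caps, cropping), so none of that is modelled here. The three error metrics are lower-is-better; δ3 is an accuracy and is higher-is-better.

```python
import math

def depth_metrics(pred, gt):
    """Common depth-estimation metrics over paired positive depth values.

    pred, gt: equal-length lists of strictly positive depths (hypothetical
    input format; real evaluations usually operate on masked image arrays).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(pred)
    pairs = list(zip(pred, gt))

    # Log10: mean absolute difference of base-10 log depths.
    log10_err = sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n
    # SqRel: mean squared error, normalised by the ground-truth depth.
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    # RMSELog: root mean squared difference of natural-log depths.
    rmse_log = math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n)
    # delta3: fraction of values whose ratio to ground truth is within 1.25^3.
    delta3 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25 ** 3) / n

    return {"log10": log10_err, "sq_rel": sq_rel, "rmse_log": rmse_log, "delta3": delta3}
```

Calling `depth_metrics([1.0, 4.0], [1.0, 2.0])` scores one exact prediction and one that is off by a factor of two. Because 2 exceeds 1.25³ ≈ 1.953, only the first value counts toward δ3.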
