Enabling Safe UAV Navigation in Transparent and Specular Environments via Generative Depth Completion
Abstract
Reliable perception and autonomous navigation are critical capabilities for unmanned aerial vehicles (UAVs). Transparent and specular surfaces (TSS), such as glass facades and mirrors, dominate modern architecture, yet these materials are a perceptual blind spot for conventional UAV sensing, leading to frequent collisions. In this work, we present a solution that enables UAVs to mimic the human ability to infer TSS geometry using only conventional sensors. We introduce a unified navigation framework centered on a geometry-guided, diffusion-based depth completion model. By injecting sparse LiDAR measurements as explicit geometric constraints into the diffusion process, we resolve scale ambiguity and improve metric accuracy. To enable real-time performance, we adopt a single-step inference strategy derived from diffusion theory, bypassing iterative denoising to achieve high-speed depth generation on resource-constrained platforms. Furthermore, we introduce a cross-modal mapping algorithm that fuses the generated depth with raw LiDAR data, preventing the loss of critical obstacle cues. We validate the framework through extensive real-world flight experiments across diverse indoor, outdoor, and nighttime settings. Our approach outperforms state-of-the-art methods in depth completion and mapping, effectively closing the TSS perception gap in robotics and extending the operational envelope of autonomous UAVs in complex human-centric environments.
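The abstract names three mechanisms: conditioning a depth-diffusion model on sparse LiDAR, collapsing iterative denoising into a single inference step, and fusing the generated depth with raw LiDAR returns. The sketch below illustrates one way these pieces could fit together; it is a minimal illustration under stated assumptions, not the paper's implementation. `DepthDenoiser`, `single_step_depth`, the six-channel input layout, and the consistency-style one-step sampler (a network trained to map noise directly to clean depth) are all hypothetical.

```python
# Minimal PyTorch sketch of the pipeline described above. All names here
# (DepthDenoiser, single_step_depth, the 6-channel input) are hypothetical
# illustrations, not the architecture from the paper.
import torch
import torch.nn as nn


class DepthDenoiser(nn.Module):
    """Stand-in for the (unspecified) geometry-guided denoising network.

    Inputs: a noise image, the RGB frame, projected sparse LiDAR depth, and
    a validity mask marking pixels with real returns. Output: dense depth.
    """

    def __init__(self, width: int = 32):
        super().__init__()
        # 1 noise + 3 RGB + 1 sparse depth + 1 mask = 6 input channels.
        self.net = nn.Sequential(
            nn.Conv2d(6, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 1, 3, padding=1),
        )

    def forward(self, noise, rgb, sparse_depth, mask):
        # Sparse LiDAR enters as extra input channels, i.e. as an explicit
        # geometric condition on the generative process.
        return self.net(torch.cat([noise, rgb, sparse_depth, mask], dim=1))


@torch.no_grad()
def single_step_depth(model, rgb, sparse_depth, mask):
    """One forward pass from Gaussian noise to dense depth.

    Assumes a network trained to predict the clean sample directly
    (consistency-/distillation-style), which is one way to realize the
    "single-step inference" the abstract refers to.
    """
    noise = torch.randn_like(sparse_depth)
    dense = model(noise, rgb, sparse_depth, mask)
    # Cross-modal fusion as a hard constraint: wherever LiDAR actually
    # returned a measurement, keep it verbatim so real obstacle cues are
    # never overwritten by generated depth.
    return torch.where(mask.bool(), sparse_depth, dense)


# Toy usage: a 64x64 frame with roughly 2% simulated LiDAR coverage.
rgb = torch.rand(1, 3, 64, 64)
mask = (torch.rand(1, 1, 64, 64) < 0.02).float()
sparse = torch.rand(1, 1, 64, 64) * mask  # depth only at return pixels
dense = single_step_depth(DepthDenoiser(), rgb, sparse, mask)
print(dense.shape)  # torch.Size([1, 1, 64, 64])
```

Fusing via `torch.where` treats LiDAR returns as hard per-pixel constraints; the framework's cross-modal mapping presumably operates at the map level rather than per pixel, but the principle is the same: generated depth fills the gaps that TSS leave in the sensor data without overriding real measurements.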