Zero-shot Object Visual Navigation Using Relation of Historical Objects with Target Transfer

Abstract

In complex 3D indoor environments, accurately locating target objects remains a significant challenge for autonomous navigation systems. Existing methods often struggle to generalize to unseen (zero-shot) targets, limiting the agent's long-term reasoning and semantic understanding during target localization. To address these challenges, we propose a novel framework, the Object Temporal Semantic Transfer Network (OTSNet). The framework consists of two core modules: the Object Temporal Semantic Module, which models the temporal relationships of observed objects using memory-stored context matrices and self-attention, and the Graph Convolution Module, which captures spatial relationships between objects in the environment. Additionally, an Adaptive Weight Strategy Module transfers the relationships between object categories and scene contexts learned during training to unseen categories in zero-shot settings. By transferring knowledge from known target objects to unknown ones, OTSNet enables zero-shot navigation and significantly enhances the agent's long-term reasoning and semantic understanding in unseen scenes, allowing efficient target search without prior environmental knowledge. Experimental results demonstrate that OTSNet outperforms existing methods, achieving a higher success rate on the AI2-THOR platform.
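To make the described architecture more concrete, the sketch below shows one plausible way the three modules could be wired together in PyTorch: self-attention over a memory of object-context vectors, a graph convolution over an object-relation graph, and a target-conditioned gate that weights the two feature streams. This is a minimal illustration only; all class names, dimensions, the gating scheme, and the embedding-table target encoder are assumptions and are not taken from the paper (a real zero-shot setup would likely use pretrained word vectors, e.g., GloVe, so that unseen target categories can be encoded).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ObjectTemporalSemanticModule(nn.Module):
    """Self-attention over a memory of per-step object-context vectors."""

    def __init__(self, num_classes: int, dim: int, memory_len: int = 16):
        super().__init__()
        self.memory_len = memory_len
        self.embed = nn.Linear(num_classes, dim)   # project each stored context row
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, context_memory: torch.Tensor) -> torch.Tensor:
        # context_memory: (B, T, num_classes) -- object contexts from the last T steps
        h = self.embed(context_memory[:, -self.memory_len:])
        out, _ = self.attn(h, h, h)                # relate observations across time
        return out.mean(dim=1)                     # (B, dim) temporal summary


class GraphConvolutionModule(nn.Module):
    """Single GCN layer over an object-relation graph (spatial relations)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (B, N, in_dim); adj: (N, N) normalized category adjacency
        return F.relu(self.weight(torch.matmul(adj, node_feats)))


class OTSNetSketch(nn.Module):
    """Fuses temporal and spatial object features with target-conditioned
    adaptive weights, then predicts action logits (illustrative only)."""

    def __init__(self, num_classes: int, dim: int = 128, num_actions: int = 6):
        super().__init__()
        self.temporal = ObjectTemporalSemanticModule(num_classes, dim)
        self.graph = GraphConvolutionModule(num_classes, dim)
        # Placeholder target encoder; pretrained word vectors would be needed
        # for genuinely unseen categories in a zero-shot setting.
        self.target_embed = nn.Embedding(num_classes, dim)
        self.gate = nn.Sequential(nn.Linear(dim, 2), nn.Softmax(dim=-1))
        self.policy = nn.Linear(dim * 2, num_actions)

    def forward(self, context_memory, node_feats, adj, target_idx):
        t_feat = self.temporal(context_memory)            # (B, dim)
        g_feat = self.graph(node_feats, adj).mean(dim=1)  # (B, dim)
        tgt = self.target_embed(target_idx)               # (B, dim)
        w = self.gate(tgt)                                # (B, 2) adaptive weights
        fused = w[:, :1] * t_feat + w[:, 1:] * g_feat     # target-dependent fusion
        return self.policy(torch.cat([fused, tgt], dim=-1))


# Example shapes (hypothetical): 100 object categories, an 18-node relation
# graph, a memory of 8 past steps, and a single-sample batch.
net = OTSNetSketch(num_classes=100)
logits = net(torch.zeros(1, 8, 100),              # context memory
             torch.eye(100)[:18].unsqueeze(0),    # one-hot node features
             torch.eye(18),                       # identity adjacency
             torch.tensor([3]))                   # target category index
```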