Visual Localisation Using Deep Learning and Graph Neural Networks: Approaches and Evaluation
Abstract
This paper addresses the visual localization problem: estimating camera position and orientation from images of a known scene. Traditional localization methods based on local feature matching struggle to generalize to new scenarios. In contrast, this study explores state-of-the-art techniques, including deep learning models and graph neural networks, to enhance feature extraction and matching. We implemented five models: SIFT, a CNN-based baseline, Hierarchical Localisation, an ImageSimilarity-Autoencoder, and the SuperGlue feature matching model. Evaluated on a dataset from the Getty Center in Los Angeles for a Kaggle competition, the SuperGlue model significantly outperformed the others, achieving a mean absolute error (MAE) of 6.37266. The findings suggest that leveraging advanced architectures and attention mechanisms can substantially improve visual localization performance, even under challenging conditions. This research highlights the potential of integrating deep learning and graph neural networks in practical localization tasks.
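To make the evaluation metric concrete, the sketch below shows one plausible way a mean absolute error (MAE) over predicted camera poses could be computed. The pose layout (x, y, z plus three rotation angles) and all numeric values are illustrative assumptions, not the competition's actual scoring script.

```python
import numpy as np

def pose_mae(predicted: np.ndarray, ground_truth: np.ndarray) -> float:
    """Mean absolute error averaged over all pose components and all images.

    Both arrays have shape (n_images, 6): hypothetically
    (x, y, z, yaw, pitch, roll) per image.
    """
    return float(np.mean(np.abs(predicted - ground_truth)))

# Illustrative predictions vs. ground truth for two images.
pred = np.array([[1.0, 2.0, 0.5, 0.1, 0.0, 0.0],
                 [3.0, 1.0, 0.4, 0.2, 0.1, 0.0]])
gt   = np.array([[1.2, 1.8, 0.5, 0.1, 0.0, 0.1],
                 [2.8, 1.1, 0.5, 0.2, 0.0, 0.0]])

print(round(pose_mae(pred, gt), 4))  # → 0.0833
```

A lower MAE means the predicted poses are, on average, closer to the ground truth across every component, which is how the SuperGlue model's score of 6.37266 compares against the other four models.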