HGACH: Hypergraph Attention Convolutional Hashing for Semi-supervised Cross-modal Retrieval

Abstract

The semi-supervised paradigm, which combines a small number of labeled instances with a large number of unlabeled ones, has recently garnered significant attention in cross-modal retrieval. It is particularly advantageous because it not only harnesses the supervisory signals from labeled data but also exploits the latent information embedded in unlabeled samples. Nevertheless, previous works predominantly focus on pairwise relationships between instances and overlook higher-order relationships among samples, and consequently fail to fully exploit the underlying structure of the data. To bridge this gap, we propose a novel hypergraph attention convolutional hashing (HGACH) method for semi-supervised cross-modal retrieval. Unlike prior works, HGACH incorporates a dedicated mechanism that prioritizes unlabeled samples, enabling the model to capture complex higher-order dependencies via hypergraph attention convolutional networks. This allows the interdependencies between modalities to be modeled more faithfully and significantly boosts retrieval accuracy. In addition, a robust similarity matrix is carefully designed to explicitly model both inter-modality and intra-modality distances, which is essential for correctly identifying similarities between different types of data. Furthermore, for labeled instances, we propose a supervised loss that preserves similarity according to the given labels, ensuring that the model's predictions are consistent with the labeled data. Experimental results demonstrate that HGACH outperforms existing state-of-the-art (SOTA) methods in retrieval performance, showcasing its effectiveness on complex cross-modal retrieval tasks. The code is available at https://anonymous.4open.science/r/HGACH-1D7F/.
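
To make the hypergraph attention convolution mentioned in the abstract concrete, the sketch below shows a minimal, self-contained layer in PyTorch. It is an illustrative assumption rather than the authors' implementation (their code is at the repository linked above): it follows the standard hypergraph propagation rule X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta, with the fixed hyperedge weights W replaced by attention scores computed from node and hyperedge features. The class name HypergraphAttentionConv, the mean-pooled hyperedge features, and all tensor shapes are hypothetical choices made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HypergraphAttentionConv(nn.Module):
    """Minimal sketch of a hypergraph attention convolution layer (not the HGACH code)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)   # node feature transform
        self.att = nn.Linear(2 * out_dim, 1, bias=False)      # (node, hyperedge) scoring

    def forward(self, x, incidence):
        # x: (N, in_dim) node features; incidence: (N, E) binary incidence matrix H
        x = self.theta(x)                                      # (N, out_dim)
        # Hyperedge features as the mean of their member nodes (an illustrative choice).
        edge_deg = incidence.sum(dim=0).clamp(min=1)           # (E,)
        edge_feat = (incidence.t() @ x) / edge_deg.unsqueeze(1)
        # Attention score for every (node, hyperedge) incidence pair.
        n, e = incidence.shape
        pair = torch.cat([x.unsqueeze(1).expand(n, e, -1),
                          edge_feat.unsqueeze(0).expand(n, e, -1)], dim=-1)
        scores = F.leaky_relu(self.att(pair).squeeze(-1))      # (N, E)
        scores = scores.masked_fill(incidence == 0, float('-inf'))
        att = torch.softmax(scores, dim=1)                     # attention over hyperedges
        att = torch.nan_to_num(att)                            # nodes belonging to no hyperedge
        h = incidence * att                                    # attention-weighted incidence
        # Normalised two-step propagation: nodes -> hyperedges -> nodes.
        node_deg = h.sum(dim=1).clamp(min=1e-6)
        edge_deg = h.sum(dim=0).clamp(min=1e-6)
        msg = (h.t() @ (x / node_deg.sqrt().unsqueeze(1))) / edge_deg.unsqueeze(1)
        out = (h @ msg) / node_deg.sqrt().unsqueeze(1)
        return F.relu(out)

# Toy usage: 6 samples grouped into 3 hyperedges.
if __name__ == "__main__":
    x = torch.randn(6, 32)
    H = torch.tensor([[1, 0, 0], [1, 1, 0], [0, 1, 0],
                      [0, 1, 1], [0, 0, 1], [1, 0, 1]], dtype=torch.float)
    layer = HypergraphAttentionConv(32, 16)
    print(layer(x, H).shape)  # torch.Size([6, 16])

In a cross-modal setting, such a layer would be applied to joint image-text features, with hyperedges grouping labeled and unlabeled samples that share neighborhood structure; the resulting node embeddings would then feed the hashing heads.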