Enhancing Cross-Modal Retrieval via Label Graph Optimization and Hybrid Loss Functions
Abstract
Cross-modal retrieval, particularly image-text matching, is crucial in multimedia analysis and artificial intelligence, with applications in intelligent search and human-computer interaction. Current methods often overlook the rich semantic relationships between labels, which limits the discriminability of the learned representations. We introduce a Two-Layer Graph Convolutional Network (L2-GCN) to model label correlations, together with a hybrid loss function, Circle-Soft, that improves cross-modal alignment and discriminability. Evaluated on the NUS-WIDE, MIRFlickr, and MS-COCO datasets, our approach achieves state-of-the-art performance, demonstrating its effectiveness and robustness. The source code is available at https://github.com/buzzcut619/L2-GCN-CIRCLE-SOFT
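The idea of modeling label correlations with a two-layer GCN can be illustrated with a minimal sketch. This is an assumption-laden toy example of the general technique (normalized adjacency propagation with two weight layers), not the paper's actual implementation; the layer sizes, the random adjacency, and all variable names are hypothetical.

```python
# Toy sketch of a two-layer GCN over a label co-occurrence graph.
# All dimensions and the adjacency values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_labels, in_dim, hid_dim, out_dim = 5, 8, 16, 8

# A: symmetric binary label co-occurrence adjacency (hypothetical values).
A = (rng.random((num_labels, num_labels)) > 0.5).astype(float)
A = np.maximum(A, A.T)

# Normalized adjacency with self-loops: A_hat = D^{-1/2} (A + I) D^{-1/2}.
A_tilde = A + np.eye(num_labels)
d = A_tilde.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

# X: initial label embeddings (word vectors would be typical in practice).
X = rng.standard_normal((num_labels, in_dim))
W0 = rng.standard_normal((in_dim, hid_dim)) * 0.1
W1 = rng.standard_normal((hid_dim, out_dim)) * 0.1

H1 = np.maximum(A_hat @ X @ W0, 0.0)  # layer 1 with ReLU activation
H2 = A_hat @ H1 @ W1                  # layer 2: refined label embeddings

print(H2.shape)  # one refined embedding per label
```

Each label's output embedding mixes information from its graph neighbors twice (once per layer), which is how correlated labels end up with correlated representations.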