A Retrieval Model with Contextual Correlation Analysis for Verbose Queries


Abstract

Retrieving relevant documents in response to verbose queries is a key challenge in information retrieval, as such queries often include extraneous terms. Traditional retrieval models treat all query terms equally, which limits their effectiveness. Existing methods for verbose queries are typically supervised or rely on costly two-stage ranking pipelines. We propose a fully unsupervised, single-phase retrieval model that estimates the centrality of each query term by analyzing its contextual correlation with the entire query. A fully connected term graph is constructed, where edge weights capture the relative correlation of each term with the query context compared to the other terms. Centrality scores are computed via power iteration over this graph. Dense representations of query terms and context are obtained using a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model. To further reduce the influence of non-informative document terms, an additional weight based on term information content is introduced. These two weights are combined and integrated into a modified Markov Random Field Sequential Dependence Model (SDM) for ranking. Experiments show that our model outperforms unsupervised baselines, performs comparably to supervised baselines, and surpasses several neural rankers in zero-shot settings. Comparable results with both GloVe and BERT embeddings highlight its independence from the choice of embedding. The model shows larger gains on longer queries and modest improvements on shorter ones, but never underperforms SDM. Therefore, the model's independence from relevance judgments and top-ranked documents, along with its consistent, embedding-agnostic performance across query lengths, makes it well-suited for low-resource scenarios.
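
To make the centrality computation concrete, the following is a minimal Python sketch of power iteration over a fully connected term graph, assuming precomputed term embeddings (from BERT or GloVe) and a single query-context embedding. All names (term_centrality, damping, etc.) and the exact edge-weight formula are illustrative assumptions, not taken from the paper.

import numpy as np

def term_centrality(term_vecs, context_vec, damping=0.85, iters=100, tol=1e-8):
    # term_vecs: (n_terms, dim) embeddings of the query terms.
    # context_vec: (dim,) embedding of the whole query (its context).
    n = len(term_vecs)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Correlation of each term with the query context, clipped so that
    # edge weights stay non-negative.
    ctx_sim = np.maximum([cos(v, context_vec) for v in term_vecs], 0.0)

    # Fully connected graph: the weight of edge i -> j reflects how
    # strongly term j correlates with the context relative to the
    # other terms (one plausible reading of the abstract's edge weights).
    W = np.tile(ctx_sim, (n, 1))
    np.fill_diagonal(W, 0.0)
    W = W / (W.sum(axis=1, keepdims=True) + 1e-12)  # row-stochastic

    # PageRank-style power iteration until convergence.
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        updated = (1.0 - damping) / n + damping * (scores @ W)
        if np.abs(updated - scores).sum() < tol:
            scores = updated
            break
        scores = updated
    return scores / scores.sum()

Per the abstract, such centrality scores would then be combined with a term information-content weight and plugged into the term potentials of the modified SDM during ranking; the details of that combination are given in the full paper.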
