Information-Constrained Retrieval for Scientific Literature via Large Language Model Agents
Abstract
This study addresses the problems of insufficient semantic understanding, interference from redundant information, and limited reliability of results in scientific literature retrieval by proposing a retrieval algorithm based on information constraints and large language model agents. The method introduces semantic encoding and similarity matching to achieve deeper associations between queries and candidate documents, while multi-dimensional information constraints regulate the retrieval process to balance relevance, coverage, and logical consistency. The agent module performs interactive reasoning and dynamic ranking, refining preliminary results through constraint optimization to improve accuracy and robustness. The dataset consists of large-scale scientific literature covering multiple disciplines, reflecting the complexity and diversity of real academic environments. Evaluation with Precision@k, Recall@k, MRR, and NDCG@k demonstrates superior performance, especially in accuracy and ranking quality. Sensitivity experiments on hyperparameters, environments, and data further confirm the stability and applicability of the method. The findings show that integrating information constraints with large language model agents effectively reduces redundancy and noise in retrieval and sustains high performance in complex academic contexts, providing reliable support for scientific research.
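For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of their standard textbook definitions under binary relevance. It is not the authors' evaluation code: the function names, the document IDs, and the binary-relevance assumption are all illustrative, and the abstract does not say whether graded relevance was used for NDCG@k.

```python
import math
from typing import List, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """MRR: reciprocal rank averaged over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains (assumption: no graded relevance labels)."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    # Ideal ranking places every relevant document at the top positions.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical ranking for one query; IDs are made up for illustration.
    ranked = ["d3", "d1", "d7", "d2", "d9"]
    relevant = {"d1", "d2", "d5"}
    print(precision_at_k(ranked, relevant, 5))
    print(recall_at_k(ranked, relevant, 5))
    print(mean_reciprocal_rank([(ranked, relevant)]))
    print(ndcg_at_k(ranked, relevant, 5))
```

Precision@k and Recall@k ignore the order within the top k, whereas MRR and NDCG@k reward placing relevant documents earlier, which is why the latter two are the ones that speak to ranking quality.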