HyperTSRAG: Temporal-Spatial Reasoning over Hypergraph Knowledge Structures for Multimodal Retrieval-Augmented Generation
Abstract
Existing graph-based retrieval-augmented generation (RAG) systems represent knowledge with binary relations and rely primarily on semantic similarity for retrieval. This design struggles with multimodal queries that involve temporal constraints, spatial relationships, or higher-order interactions among entities. We present HyperTSRAG, a retrieval algorithm for multimodal RAG that models knowledge as a hypergraph and performs explicit temporal-spatial reasoning during traversal. HyperTSRAG performs a bi-directional breadth-first traversal that alternates between entity nodes and hyperedge nodes to capture higher-order connectivity. During traversal, it ranks candidate evidence with a scoring function that integrates semantic similarity, temporal coherence, spatial overlap, and structural importance. We evaluate HyperTSRAG on a benchmark corpus of 1,000 multimodal documents (text, images, audio, and video) with 500 queries ranging from simple lookups to complex multi-hop reasoning. For complex queries, HyperTSRAG achieves 78.3% Recall@10, improving by 12.4% over GraphRAG and 15.7% over LightRAG, while maintaining a 95th-percentile latency of 1.83 s. On temporal-spatial subsets, HyperTSRAG attains 85.2% accuracy on temporal queries and 81.6% on spatial queries. Ablation studies show that the individual scoring components each contribute gains of 2–5%, while the hypergraph representation and traversal provide a 13.2% improvement over a binary-graph equivalent. These results establish hypergraph-native traversal with temporal-spatial-aware scoring as an effective paradigm for multimodal RAG, particularly for constraint-driven queries that semantic-only retrieval cannot address.
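The traversal and scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the `Hyperedge` fields, the interval-overlap and intersection-over-union measures, the size-based structural term, and the weights are all hypothetical choices made for illustration. "Bi-directional" is read here as stepping both from entities to hyperedges and from hyperedges back to entities, which may differ from the paper's definition.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """A higher-order relation joining any number of entities (hypothetical schema)."""
    name: str
    members: frozenset   # entity ids connected by this hyperedge
    semantic: float      # assumed precomputed query similarity in [0, 1]
    interval: tuple      # (start, end) timestamps
    box: tuple           # (x_min, y_min, x_max, y_max)


def temporal_coherence(a, b):
    """Assumed measure: interval overlap divided by the combined span."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / span if span > 0 else 1.0


def spatial_overlap(a, b):
    """Assumed measure: intersection-over-union of two axis-aligned boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def score(edge, q_interval, q_box, max_size, weights=(0.4, 0.25, 0.25, 0.1)):
    """Weighted sum of the four signals; the weights are illustrative only."""
    w_sem, w_time, w_space, w_struct = weights
    structural = len(edge.members) / max_size  # larger hyperedges rank higher
    return (w_sem * edge.semantic
            + w_time * temporal_coherence(edge.interval, q_interval)
            + w_space * spatial_overlap(edge.box, q_box)
            + w_struct * structural)


def retrieve(edges, seeds, q_interval, q_box, max_hops=2, top_k=10):
    """Breadth-first traversal alternating entity -> hyperedge -> entity."""
    incidence = {}  # entity id -> hyperedges that contain it
    for e in edges:
        for m in e.members:
            incidence.setdefault(m, []).append(e)
    max_size = max(len(e.members) for e in edges)

    seen = set(seeds)
    scored = {}
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        entity, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        for e in incidence.get(entity, []):      # entity -> hyperedge step
            if e.name in scored:
                continue
            scored[e.name] = score(e, q_interval, q_box, max_size)  # scored on visit
            for m in e.members:                  # hyperedge -> entity step
                if m not in seen:
                    seen.add(m)
                    frontier.append((m, hops + 1))
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy data: e1 matches the query window and region, e2 does not, e3 is unreachable.
edges = [
    Hyperedge("e1", frozenset({"a", "b", "c"}), 0.9, (0, 10), (0, 0, 2, 2)),
    Hyperedge("e2", frozenset({"c", "d"}), 0.5, (20, 30), (5, 5, 6, 6)),
    Hyperedge("e3", frozenset({"x", "y"}), 1.0, (0, 10), (0, 0, 2, 2)),
]
results = retrieve(edges, seeds=["a"], q_interval=(0, 10), q_box=(0, 0, 2, 2))
```

In this toy example, `e1` outranks `e2` because it matches the query's time window and region, while `e3` is never scored despite its high semantic similarity because no path of shared entities connects it to the seed. This shows how a traversal-based retriever differs from ranking all candidates by similarity alone.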