Synergistic Cloud-Edge Intelligence for Real-time Multimodal Entity Linking and Knowledge Retrieval
Abstract
The synergy between powerful, cloud-hosted large models and efficient, on-device small models presents a new frontier for mobile artificial intelligence. However, enabling real-time, knowledge-intensive interactions, such as querying physical objects in the user's environment, is severely hampered by the high latency of cloud-centric approaches and the limited reasoning capabilities of edge-only models. This paper introduces a novel synergistic cloud-edge intelligence framework designed to bridge this critical gap. Our framework implements a semantic partitioning of labor: a lightweight edge client performs real-time perception to generate compact multimodal features, while an intelligent trigger decides when to offload a structured, minimal-data payload to a powerful cloud-based reasoning engine for definitive entity linking and knowledge-grounded question answering. We validate our approach on SynergyQA, a new benchmark for this task. Experimental results show that our framework reduces end-to-end latency by over 80% and data transmission by over 98% compared to naive offloading strategies, while achieving question-answering performance that is competitive with state-of-the-art cloud-only models. Our work provides an efficient and robust blueprint for deploying the next generation of interactive, context-aware AI on mobile and edge devices.
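The division of labor described in the abstract (edge perception, a trigger, and a structured payload in place of raw sensor data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the trigger rule, the payload schema, or any thresholds, so the names `EdgeObservation`, `should_offload`, `build_payload`, and the 0.8 confidence cutoff are hypothetical and are not the authors' implementation.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class EdgeObservation:
    """Compact output of a hypothetical on-device perception model."""
    label: str                # coarse class predicted on the device
    confidence: float         # edge model's confidence in that label, 0..1
    embedding: List[float]    # small multimodal feature vector
    question: Optional[str]   # the user's question, if one was asked


def should_offload(obs: EdgeObservation, threshold: float = 0.8) -> bool:
    """Assumed trigger rule: offload when the user asks a question
    or when the edge model is not confident in its own label."""
    return obs.question is not None or obs.confidence < threshold


def build_payload(obs: EdgeObservation, precision: int = 3) -> bytes:
    """Serialize a structured payload to send instead of the raw frame.
    Rounding the embedding is an illustrative way to keep it small."""
    body = asdict(obs)
    body["embedding"] = [round(x, precision) for x in obs.embedding]
    return json.dumps(body, separators=(",", ":")).encode("utf-8")


def handle(obs: EdgeObservation) -> Optional[bytes]:
    """Return the bytes to send to the cloud, or None to stay on-device."""
    return build_payload(obs) if should_offload(obs) else None
```

In this sketch a confident observation with no question never leaves the device, while a question or a low-confidence label produces a payload of a few hundred bytes. The latency and transmission figures reported in the abstract come from the authors' experiments, not from this example.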