Synergistic Cloud-Edge Intelligence for Real-time Multimodal Entity Linking and Knowledge Retrieval

Abstract

The synergy between powerful, cloud-hosted large models and efficient, on-device small models presents a new frontier for mobile artificial intelligence. However, enabling real-time, knowledge-intensive interactions, such as querying physical objects in the user's environment, is severely hampered by the high latency of cloud-centric approaches and the limited reasoning capabilities of edge-only models. This paper introduces a novel synergistic cloud-edge intelligence framework designed to bridge this critical gap. Our framework implements a semantic partitioning of labor: a lightweight edge client performs real-time perception to generate compact multimodal features, while an intelligent trigger decides when to offload a structured, minimal-data payload to a powerful cloud-based reasoning engine for definitive entity linking and knowledge-grounded question answering. We validate our approach on SynergyQA, a new benchmark for this task. Experimental results show that our framework reduces end-to-end latency by over 80% and data transmission by over 98% compared to naive offloading strategies, while achieving question-answering performance that is competitive with state-of-the-art cloud-only models. Our work provides an efficient and robust blueprint for deploying the next generation of interactive, context-aware AI on mobile and edge devices.
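
The division of labor described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the trigger works or what the payload contains, so everything below is an assumption made for illustration: the `EdgeObservation` fields, the confidence-threshold rule in `should_offload`, and the JSON layout in `build_payload` are hypothetical and are not the authors' method.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeObservation:
    """Hypothetical output of the on-device perception step."""
    label: str               # top-1 entity guess from the edge model
    confidence: float        # edge model confidence in [0, 1]
    embedding: List[float]   # compact multimodal feature vector
    question: str            # the user's question about the object


def should_offload(obs: EdgeObservation, threshold: float = 0.8) -> bool:
    """Assumed trigger rule: offload only when the edge model is unsure.

    The paper's trigger may use different signals; a plain confidence
    threshold is used here purely as a stand-in.
    """
    return obs.confidence < threshold


def build_payload(obs: EdgeObservation, precision: int = 3) -> bytes:
    """Serialize compact features instead of the raw image or audio.

    Rounding the embedding is an illustrative way to shrink the payload.
    """
    body = {
        "label_hint": obs.label,
        "embedding": [round(x, precision) for x in obs.embedding],
        "question": obs.question,
    }
    return json.dumps(body, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    obs = EdgeObservation("mug", 0.55, [0.12345, -0.5], "What brand is this?")
    if should_offload(obs):
        payload = build_payload(obs)
        print(len(payload), "bytes would be sent to the cloud")
```

In this sketch, the saving in transmitted data comes from sending a short feature vector and a text question rather than the captured frame itself; any real deployment would need to define the feature format that the cloud reasoning engine expects.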
