Jigsaw-Like Knowledge Graph Generation: A Study on Generalization Patterns with a LightRAG Implementation

Abstract

The integration of knowledge graphs (KGs) with Retrieval-Augmented Generation (RAG) has significantly advanced domain-specific question-answering systems. However, a critical limitation persists in existing KG-based RAG frameworks: the inability to efficiently handle localized updates within a dynamic document corpus. Current methods typically necessitate a complete KG rebuild for even minor changes, leading to prohibitive large language model (LLM) token consumption and significant KG generation time. To address this, we propose a novel jigsaw-like methodology that assembles and maintains the global KG from document-level subgraphs. Our approach leverages document lifecycle states (new, modified, persistent, deleted) to isolate and process only the 'delta changes' within the corpus. By decomposing the KG into document-level subgraphs, we enable token-efficient, localized updates in which LLM extraction is invoked solely for altered documents, while subgraphs from unchanged content are reused. We implement and evaluate Jigsaw-LightRAG, an extension of the vanilla LightRAG framework that realizes this algorithm. Extensive experiments on public datasets demonstrate that Jigsaw-LightRAG reduces LLM token consumption by orders of magnitude during incremental updates while maintaining the structural integrity of the KG and achieving performance parity with full-rebuild baselines on question answering (QA) tasks. This work provides a computationally efficient and robust solution for dynamic AI knowledge base management, offering substantial practical value for applications requiring frequent KG updates.
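
To make the delta-update idea concrete, the following is a minimal Python sketch of the jigsaw-style maintenance loop the abstract describes: documents are classified by lifecycle state via content hashing, the token-expensive LLM extraction runs only for new or modified documents, and the global KG is re-assembled from cached per-document subgraphs. All names here (llm_extract_subgraph, incremental_update, the triple-set representation) are illustrative assumptions for exposition, not the actual Jigsaw-LightRAG or LightRAG API.

```python
import hashlib

# Assumed subgraph representation: a set of (head, relation, tail)
# triples extracted from one document. Hypothetical, for illustration.
Subgraph = set


def doc_hash(text: str) -> str:
    """Content hash used to detect modified documents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def llm_extract_subgraph(text: str) -> Subgraph:
    """Stand-in for the token-expensive LLM extraction step.
    In the jigsaw scheme this is the only call made for changed docs."""
    return {("entity_a", "related_to", "entity_b")}  # stub output


def incremental_update(corpus: dict, cache: dict) -> dict:
    """Classify each document by lifecycle state and refresh the
    per-document subgraph cache.

    corpus: doc_id -> current text
    cache:  doc_id -> (content_hash, Subgraph) from the previous run
    """
    updated = {}
    for doc_id, text in corpus.items():
        h = doc_hash(text)
        if doc_id not in cache:
            # New document: extract a fresh subgraph.
            updated[doc_id] = (h, llm_extract_subgraph(text))
        elif cache[doc_id][0] != h:
            # Modified document: re-extract only this one.
            updated[doc_id] = (h, llm_extract_subgraph(text))
        else:
            # Persistent document: reuse the cached subgraph, zero tokens.
            updated[doc_id] = cache[doc_id]
    # Deleted documents are absent from `corpus`, so their entries drop
    # out of the cache and their triples vanish from the merged KG.
    return updated


def merge_global_kg(cache: dict) -> Subgraph:
    """Assemble the global KG as the union of document-level subgraphs."""
    kg: Subgraph = set()
    for _, subgraph in cache.values():
        kg |= subgraph
    return kg
```

Under these assumptions, LLM cost scales with the size of the delta (new plus modified documents) rather than with the whole corpus, which is the source of the order-of-magnitude token savings the abstract reports.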
