Semantic Similarity Relaxation and Approximation of Incomplete Queries Using LLMs Embedding to Topic Graph Mining

Abstract

In recent years, the Resource Description Framework (RDF) has emerged as a pivotal technology for structuring and interlinking data on the web. RDF graphs typically contain billions of labelled entities, and how to efficiently retrieve the needed information from an RDF knowledge graph (KG) for a given SPARQL query has recently drawn growing attention. However, because RDF data is schema-free, it is very challenging for users to fully understand the underlying structure. As a result, different graph fragments can represent the same information, and it is extremely difficult to write complex SPARQL queries that cover all possible structures. Recently, researchers have started to use knowledge semantics to extend the intention of a simplified query and obtain approximate answers. In this paper, we present an efficient framework that allows access to an RDF repository even when users lack comprehensive knowledge of the underlying schema. Based on semantic similarity, we retrieve additional answers that match the simplified query. We propose a systematic method to mine RDF graphs into diverse, semantically equivalent structure patterns (topic graphs). We use type similarity to construct these patterns, and a large language model (LLM) embedding is then applied to them to obtain semantic vectors of the existing knowledge. Based on this knowledge semantics, an approximate query is constructed to retrieve the top-k semantically similar results. Extensive experiments on the DBpedia dataset and QALD-4 benchmark queries demonstrate the effectiveness and efficiency of our approach.
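
The following is a minimal sketch, not the authors' implementation, of the retrieval step the abstract describes: verbalize mined topic-graph patterns, embed them with an LLM-based encoder, and rank them against a simplified user query by cosine similarity to select the top-k candidates for query relaxation. The embedding model, the toy patterns, and the verbalization scheme are illustrative assumptions.

```python
# Sketch of top-k semantic matching between a simplified query and
# verbalized topic-graph patterns, assuming a sentence-embedding model
# as a stand-in for the paper's LLM embedding.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# Hypothetical topic-graph patterns mined from an RDF graph and grouped
# by type similarity, here verbalized as triple-pattern strings.
topic_graph_patterns = [
    "?film dbo:director ?person . ?person rdf:type dbo:Person",
    "?film dbp:directedBy ?person . ?person rdf:type dbo:Person",
    "?film dbo:starring ?actor . ?actor rdf:type dbo:Actor",
]

# Simplified (incomplete) query intention, expressed as text.
user_query = "films directed by a person"

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
pattern_vecs = model.encode(topic_graph_patterns, normalize_embeddings=True)
query_vec = model.encode([user_query], normalize_embeddings=True)[0]

# With normalized vectors, cosine similarity is a plain dot product.
scores = pattern_vecs @ query_vec

# Keep the top-k patterns; these would seed the approximate SPARQL query.
k = 2
for idx in np.argsort(-scores)[:k]:
    print(f"{scores[idx]:.3f}  {topic_graph_patterns[idx]}")
```

In a full pipeline, the selected patterns would be expanded back into SPARQL graph patterns and executed against the RDF repository, so that structurally different but semantically equivalent fragments all contribute answers to the user's simplified query.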
