Task Splitting and Prompt Engineering for Cypher Query Generation in Domain-Specific Knowledge Graphs

Abstract

The integration of large language models (LLMs) with knowledge graphs (KGs) holds significant potential for simplifying the querying of graph databases, especially for non-technical users. KGs provide a structured representation of domain-specific data, enabling rich and precise information retrieval. However, the complexity of graph query languages such as Cypher is a barrier to their effective use by non-experts. This research addresses that barrier with Prompt2Cypher (P2C), a novel approach that uses task splitting and prompt engineering to decompose user queries into manageable subtasks, enhancing LLMs’ ability to generate accurate Cypher queries that align with the underlying graph database schema. We demonstrate the effectiveness of P2C on two biological KGs (protein kinase and ion channel) that differ in size, schema, and complexity. Compared to a baseline approach, our method improves query accuracy, as shown by higher Precision, Recall, F1-score, and Jaccard similarity. This work contributes to ongoing efforts to bridge the gap between domain-specific knowledge graphs and user-friendly graph database query interfaces.
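
The abstract names four accuracy metrics but does not say how they are computed. One plausible reading, sketched below, treats the results returned by a generated Cypher query and by a reference query as sets and compares them. The function name `set_metrics`, the example gene symbols, and the convention that two empty result sets count as a perfect match are all assumptions made for illustration, not details taken from the article.

```python
def set_metrics(predicted, expected):
    """Set-based Precision, Recall, F1 and Jaccard for two result collections.

    Hypothetical helper: the article does not specify its exact metric
    definitions, so this uses the standard set-overlap forms.
    """
    pred, gold = set(predicted), set(expected)

    # Assumed convention: two empty result sets agree perfectly.
    if not pred and not gold:
        return {"precision": 1.0, "recall": 1.0, "f1": 1.0, "jaccard": 1.0}

    overlap = len(pred & gold)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    jaccard = overlap / len(pred | gold)

    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


# Illustrative (made-up) results from a generated query and a reference query.
generated_rows = ["AKT1", "MAPK1", "EGFR"]
reference_rows = ["AKT1", "EGFR", "SRC", "ABL1"]
scores = set_metrics(generated_rows, reference_rows)
print(scores)
```

Here two of the three generated rows appear in the reference set of four, so Precision is 2/3, Recall is 1/2, F1 is 4/7, and Jaccard is 2/5. Comparing result sets rather than query strings means that two syntactically different Cypher queries returning the same rows score identically.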
