Matching and Rewriting Rules in Object-Oriented Databases

Giacomo Bergami
Oliver Robert Fox
Graham Morgan

Abstract

Graph query languages such as Cypher are widely adopted to match and retrieve data in a graph representation, owing to their ability to retrieve and transform information. Even though rewriting rules are the most natural way to match and transform information, graph query languages support them rarely or only partially. This limitation has a major impact on how the information is subsequently structured, as it may then seem more natural to impose strong constraints on the data representation in order to fix how the information should be represented. On the other hand, recent works are starting to move in the opposite direction: a truly general semistructured model (GSM) makes it possible both to represent all the available data formats (Network-Based, Relational, and Semistructured) and to support a holistic query language expressing all major queries in such languages. In this paper, we show that using GSM enables the definition of a general rewriting mechanism, which current graph query languages can express only at the cost of tying the query to the specifics of the underlying data representation. We formalise the proposed query language in terms of declarative graph rewriting mechanisms described as a set of production rules L→R, both restricting the characterisation of L and extending it to support structural graph nesting operations, which are useful for aggregating similar information around an entry point of interest. We further meet our declarative requirements by determining the order in which the data should be rewritten and in which multiple rules should be applied, while ensuring that the updates applied to the GSM database persist across subsequent rewriting calls. We discuss how GSM, by fully supporting index-based data representation, allows for a better physical model implementation that leverages the benefits of columnar database storage. Preliminary benchmarks show the scalability of the proposed implementation in comparison with state-of-the-art implementations.
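
To make the idea of a production rule L→R concrete, here is a minimal Python sketch. It is not the paper's GSM model or query language: the `Graph` class, the `match` and `rewrite` helpers, and the example labels are all hypothetical names chosen for illustration. The pattern L is a single labelled edge between two labelled nodes, and R replaces each match with a reversed, relabelled edge. All matches are collected before any update is made, so the rewrite does not interfere with its own matching.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy labelled graph (hypothetical; far simpler than GSM)."""
    labels: dict = field(default_factory=dict)  # node id -> node label
    edges: set = field(default_factory=set)     # (src, edge_label, dst) triples


def match(graph, src_label, edge_label, dst_label):
    """Return every (src, dst) pair matching the pattern L, sorted for determinism."""
    return sorted(
        (s, d)
        for (s, e, d) in graph.edges
        if e == edge_label
        and graph.labels[s] == src_label
        and graph.labels[d] == dst_label
    )


def rewrite(graph, pattern, new_edge_label):
    """Apply L -> R in place: swap each matched edge for a reversed, relabelled one."""
    src_label, edge_label, dst_label = pattern
    # Match first, update afterwards: the graph is not modified while matching.
    matches = match(graph, src_label, edge_label, dst_label)
    for s, d in matches:
        graph.edges.discard((s, edge_label, d))
        graph.edges.add((d, new_edge_label, s))
    return len(matches)  # number of rewritten matches
```

For instance, on a graph with two `Person` nodes that each have a `wrote` edge to a `Paper` node, calling `rewrite(g, ("Person", "wrote", "Paper"), "author")` replaces both edges with `author` edges pointing from the paper to each person and leaves every other edge untouched.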

Version published to 10.3390/math12172677
Aug 28, 2024
Version published to 10.20944/preprints202408.0536.v1
Aug 8, 2024

Related articles

Data Structures for Range Sorted Consecutive Occurrence Queries

This article has 2 authors:
1. Waseem Akram
2. Takuya Mieno
This article has no evaluations
Latest version Jan 21, 2026

A Discovery Technique for Expressive Yet Sound Process Models

This article has 3 authors:
1. Humam Kourani
2. Gyunam Park
3. Wil M.P. van der Aalst
This article has no evaluations
Latest version Jan 12, 2026

Structured Knowledge for Multi-hop QA: A Comparative Study of GraphRAG and RAG

This article has 2 authors:
1. Nimet Aksoy
2. Murat Osman Ünalır
This article has no evaluations
Latest version Dec 9, 2025