Simulating Multi-Model Data Evolution for Benchmarking Big Data Systems
Abstract
This paper addresses the challenge of benchmarking multi-model data management systems capable of handling diverse and evolving data. Existing benchmarks are typically static, limited to specific data models, and insufficient for evaluating cross-model interoperability or schema evolution. To overcome these limitations, we introduce TransforMMer, a tool that generates dynamic, customizable benchmarks from heterogeneous datasets. The tool combines schema inference, editing, transformation, and export within a unified graphical interface. It supports multiple data models and schema versions, facilitating comparative performance evaluation across systems. Experimental results on real-world datasets demonstrate its effectiveness and adaptability. To promote reproducibility and community adoption, we also provide DaRe, a curated repository of benchmark datasets.