GIDS: Efficient Grayscale Image-based Exemplar Spatial Dataset Search Processing

Abstract

In the data-driven era, dataset search has become a critical task in data science and engineering. Traditional spatial dataset search methods rely primarily on keyword or range queries, which cannot capture the intent a user expresses through an exemplar dataset. To address this limitation, this paper studies exemplar spatial dataset search, in which the query is itself an exemplar dataset. A novel grayscale image-based similarity model is proposed, which maps the spatial distribution of each dataset to a grayscale image in order to capture detailed distribution features. Based on this model, a baseline search scheme, GIDS, is introduced. To further improve search efficiency, an optimized scheme, GIDS+, is presented with three key optimization strategies: a Morton code-based approach that accelerates similarity calculations, and an \((\omega)\)-MSDtree-based approach and an upper-bound-based approach that enable efficient pruning during candidate filtering. Experiments on three real-world spatial data repositories show that the proposed methods outperform existing approaches in search efficiency, offering a new solution for spatial dataset search.
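To make the two central ideas more concrete, the following is a minimal sketch of how a spatial dataset might be rasterized into a grayscale grid and how its cells might be keyed by Morton (Z-order) codes. It is an illustration under stated assumptions, not the paper's implementation: the function names (`to_grayscale`, `morton_encode`, `morton_cells`, `l1_distance`), the count-based intensity scaling, and the use of an L1 distance as the similarity measure are all hypothetical choices, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Bounds = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def to_grayscale(points: Sequence[Point], bounds: Bounds, resolution: int) -> List[List[int]]:
    """Rasterize 2D points into a resolution x resolution grid of 0..255 intensities.

    Assumption: intensity is the per-cell point count scaled by the densest cell.
    """
    min_x, min_y, max_x, max_y = bounds
    counts = [[0] * resolution for _ in range(resolution)]
    for x, y in points:
        col = int((x - min_x) / (max_x - min_x) * resolution)
        row = int((y - min_y) / (max_y - min_y) * resolution)
        # Clamp so that points on or outside the boundary fall in an edge cell.
        col = max(0, min(col, resolution - 1))
        row = max(0, min(row, resolution - 1))
        counts[row][col] += 1
    peak = max(max(r) for r in counts)
    if peak == 0:
        return counts
    return [[c * 255 // peak for c in r] for r in counts]


def morton_encode(row: int, col: int, bits: int = 16) -> int:
    """Interleave the bits of col (even positions) and row (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def morton_cells(image: List[List[int]]) -> Dict[int, int]:
    """Map the Morton code of every non-empty cell to its intensity."""
    return {
        morton_encode(r, c): value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > 0
    }


def l1_distance(a: Dict[int, int], b: Dict[int, int]) -> int:
    """Sum of absolute intensity differences over all cells (assumed measure)."""
    return sum(abs(a.get(k, 0) - b.get(k, 0)) for k in a.keys() | b.keys())


if __name__ == "__main__":
    exemplar = to_grayscale([(0.1, 0.1), (0.1, 0.2), (0.9, 0.9)], (0.0, 0.0, 1.0, 1.0), 2)
    candidate = to_grayscale([(0.2, 0.2)], (0.0, 0.0, 1.0, 1.0), 2)
    print(l1_distance(morton_cells(exemplar), morton_cells(candidate)))
```

Keying cells by a single integer lets two sparse images be compared through dictionary lookups rather than a full grid scan, which is one plausible reason a Morton code-based representation could speed up similarity calculations.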
