RoofeNet-Multi: A Multi-Annotator Dataset with Per-Image Vision–Language Model Annotations for Rooftop Photovoltaic Potential Estimation in the Czech Republic

Abstract

RoofeNet-Multi is a rooftop photovoltaic (PV) interpretation dataset designed to preserve label uncertainty in nadir aerial imagery by releasing multiple independent annotations per image. It contains 5,359 building-centered orthophoto tiles from the Czech Republic. The release provides 7,402 human annotation sets collected via a web-based platform (mean 1.38 sets per image) and one matched vision–language model (VLM) annotation set per image (5,359 sets), generated with Google Gemini (model identifier gemini-2.5-pro-preview-03-25; see Methods). Of the 5,359 images, 3,857 have exactly one human set and 1,502 have two or more. Each annotation set comprises axis-aligned bounding boxes for (a) roof planes (flat vs. pitched) and (b) rooftop obstructions relevant to PV placement (e.g., chimneys, vents, skylights). Unlike conventional benchmarks that publish a single “ground truth”, RoofeNet-Multi supports uncertainty-aware training and evaluation, studies of human–AI agreement under a shared schema, and preliminary PV potential estimation workflows that can account for ambiguity in roof interpretation.
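As an illustration of the kind of agreement study the abstract mentions, the following minimal sketch compares two annotation sets for the same image using intersection-over-union (IoU) on axis-aligned boxes. The `Box` type, its field names, and the `mean_best_iou` score are assumptions made for this example; they are not taken from the dataset's schema or from the authors' evaluation protocol.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical axis-aligned box in pixel coordinates (not the dataset's schema)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(b: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, b.x_max - b.x_min) * max(0.0, b.y_max - b.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def mean_best_iou(reference: Sequence[Box], candidate: Sequence[Box]) -> float:
    """For each reference box, take its best IoU against any candidate box,
    then average. This is an illustrative, asymmetric agreement score."""
    if not reference:
        return 0.0
    best = [max((iou(r, c) for c in candidate), default=0.0) for r in reference]
    return sum(best) / len(best)


if __name__ == "__main__":
    # Two made-up annotation sets for one image (e.g. human vs. VLM).
    human = [Box(0, 0, 2, 2), Box(10, 10, 12, 12)]
    vlm = [Box(1, 1, 3, 3)]
    print(mean_best_iou(human, vlm))
```

Because the score is asymmetric, swapping the two arguments can give a different value; a symmetric variant could average both directions. Agreement would also normally be computed per class (roof plane vs. obstruction), which this sketch omits.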
