RoofeNet-Multi: A Multi-Annotator Dataset with Per-Image Vision–Language Model Annotations for Rooftop Photovoltaic Potential Estimation in the Czech Republic
Abstract
RoofeNet-Multi is a rooftop photovoltaic (PV) interpretation dataset designed to preserve label uncertainty in nadir aerial imagery by releasing multiple independent annotations per image. It contains 5,359 building-centered orthophoto tiles from the Czech Republic. The release provides 7,402 human annotation sets collected via a web-based platform (mean 1.38 sets per image) and one matched vision–language model (VLM) annotation set per image (5,359 sets), generated with Google Gemini (model identifier gemini-2.5-pro-preview-03-25; see Methods). Among the 5,359 images, 3,857 have exactly one human set and 1,502 have two or more human sets. Each annotation set comprises axis-aligned bounding boxes for (a) roof planes (flat vs. pitched) and (b) rooftop obstructions relevant to PV placement (e.g., chimneys, vents, skylights). Unlike conventional benchmarks that publish a single “ground truth”, RoofeNet-Multi supports uncertainty-aware training and evaluation, studies of human–AI agreement under a shared schema, and preliminary PV potential estimation workflows that can account for ambiguity in roof interpretation.
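As an illustration of how agreement between two annotation sets might be quantified, the sketch below computes intersection over union (IoU) for a pair of axis-aligned boxes. IoU is a common choice for comparing bounding boxes, but the abstract does not say which agreement measure the authors use. The `(x_min, y_min, x_max, y_max)` pixel layout, the `iou` helper, and the example coordinates are assumptions for illustration, not the dataset's documented format.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical roof-plane boxes from a human annotator and from the VLM.
human_box: Box = (10.0, 10.0, 50.0, 50.0)
vlm_box: Box = (30.0, 30.0, 70.0, 70.0)
agreement = iou(human_box, vlm_box)
```

A per-image agreement score could then be built by matching boxes of the same class across two annotation sets and aggregating their IoU values; the matching strategy is a separate design decision that this sketch leaves open.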