Sampling bias obscures biodiversity patterns, reveals data gaps in priority conservation areas: a call for improved documentation
Abstract
Where and how species are sampled can shape biodiversity knowledge, spatial patterns, and data-driven conservation. In many Global South biodiversity hotspots, sampling remains uneven, and available data often lack the synthesis needed to assess region-wide gaps for effective conservation planning and priority-setting. This shortfall is common within conserved areas and key biodiversity areas (hereafter ‘priority conservation areas’ or PCAs). We illustrate this shortfall in the Philippines, one of the most biodiverse countries in the world, where longstanding biodiversity research and growing policy momentum support efforts to expand coverage of conserved areas. Drawing on over a century of digitally accessible species occurrence records, we compiled and manually curated data on Philippine amphibians and squamate reptiles from multiple sources and assessed the spatial distribution of observed diversity in relation to PCAs. Results reveal strong spatial biases, with preserved specimens comprising the majority of records and largely shaping observed diversity patterns. Citizen-science data complement already well-sampled regions, while records from peer-reviewed literature contribute valuable documentation in poorly sampled areas. PCAs are proportionally well-sampled, although gaps and biases remain. Sampling effort and observed diversity are higher in larger PCAs, but this positive area effect diminishes with increasing topographic relief, highlighting large mountain ranges as persistent blind spots in biodiversity documentation. Notably, some areas of higher diversity occur outside established PCAs. We discuss implications of these biases and propose enabling mechanisms to improve primary biodiversity data collection.
This study affirms the importance of integrating digitally accessible biodiversity data from multiple sources in revealing sampling gaps and biases, guiding future studies towards poorly sampled areas and informing conservation priorities.