Inferring biodiversity from indicator species using co-occurrence network structure
Abstract
Biodiversity loss is accelerating worldwide, yet comprehensive assessment of species assemblages across environmental gradients remains impractical for ecological forecasting and conservation. Indicator species are therefore widely used as proxies for community composition, but existing approaches struggle to reconstruct full assemblages, particularly for rare or poorly sampled species. Moreover, most indicator-based studies focus on predicting aggregate biodiversity metrics such as species richness, rather than inferring the occurrences of the individual species that compose assemblages. Here we present an integrated framework combining indicator species analysis, ecological network theory, and machine learning to infer the occurrence of non-indicator species from assemblage data. By leveraging structural properties of species co-occurrence networks, the approach captures latent community structure without explicit environmental modeling, enabling species-level inference under sparse sampling. Across species abundance gradients, we identify three distinct regimes of predictability: high predictability for abundant species, reduced performance for species of intermediate prevalence, and unexpectedly high predictability for rare species, driven by their strong co-occurrence structure with indicator species. By exploiting complementary information captured by multiple network metrics, the framework recovers species with diverse connectivity profiles and consistently outperforms richness-based or random indicator selection in both accuracy and coverage. Overall, this data-efficient approach provides a transferable pathway for biodiversity monitoring and forecasting, while offering new insights into the network organization of ecological communities.
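
To make the general idea concrete, the following is a minimal sketch of inferring non-indicator species from a co-occurrence network. It is not the authors' implementation: the abstract does not specify the network metrics, the indicator selection rule, or the machine learning model. Everything here is an assumption chosen for illustration: the toy survey data, Jaccard similarity as the edge weight, weighted degree (strength) as the single network metric, and a simple mean-weight threshold standing in for the learning step. All function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy survey: site -> set of species observed there.
SITES = {
    "s1": {"A", "B", "C"},
    "s2": {"A", "B"},
    "s3": {"A", "C", "D"},
    "s4": {"B", "D"},
    "s5": {"A", "B", "C"},
}


def jaccard_network(sites):
    """Build a co-occurrence network keyed by species pair.

    Edge weight (an assumption of this sketch) is the Jaccard similarity
    of the two species' site sets; pairs that never co-occur get no edge.
    """
    occurrences = {}
    for site, species in sites.items():
        for sp in species:
            occurrences.setdefault(sp, set()).add(site)
    edges = {}
    for u, v in combinations(sorted(occurrences), 2):
        shared = len(occurrences[u] & occurrences[v])
        if shared:
            union = len(occurrences[u] | occurrences[v])
            edges[frozenset((u, v))] = shared / union
    return edges


def strength(edges):
    """Weighted degree of each species: the sum of its edge weights."""
    totals = {}
    for pair, weight in edges.items():
        for sp in pair:
            totals[sp] = totals.get(sp, 0.0) + weight
    return totals


def pick_indicators(edges, k):
    """Select the k species with the highest strength (ties broken by name)."""
    totals = strength(edges)
    return sorted(totals, key=lambda sp: (-totals[sp], sp))[:k]


def predict(edges, indicators_present, candidates, threshold=0.3):
    """Predict which candidate species occur at a site.

    A candidate is predicted present when its mean edge weight to the
    indicators observed at the site reaches the threshold. This simple
    rule is a stand-in for a trained classifier.
    """
    predicted = set()
    if not indicators_present:
        return predicted
    for sp in candidates:
        weights = [edges.get(frozenset((sp, ind)), 0.0) for ind in indicators_present]
        if sum(weights) / len(weights) >= threshold:
            predicted.add(sp)
    return predicted


edges = jaccard_network(SITES)
indicators = pick_indicators(edges, k=2)
non_indicators = set(strength(edges)) - set(indicators)
# Suppose a new site is surveyed for indicator species only and both are found.
predicted = predict(edges, set(indicators), non_indicators)
print("indicators:", indicators)
print("predicted non-indicators:", sorted(predicted))
```

In this sketch a species with few records can still be predicted if its few occurrences coincide closely with an indicator, since Jaccard similarity rewards overlap relative to combined range rather than raw frequency. Combining several metrics for indicator selection, as the abstract describes, would replace the single `strength` ranking used here.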