Deviation from Power-Law Distribution when Scaling the Distribution of Marine Plankton Folds from Genomes to Communities
Abstract
At different scales of living systems, biological entities appear to follow scaling laws, such as power laws, which are often explained by stochastic mechanisms. This is the case for the number of species in a community and for the number of genes or protein folds in a genome. As a result of evolutionary processes that combine gene family duplications and expansions with selective pressures, the distribution of protein folds systematically follows a power law in all individually observed genomes: a small number of folds are highly prevalent, while the majority of folds appear only once per genome. However, previous studies of fold occurrence have focused on individual genomes, isolated from their community contexts. In the oceans, plankton communities consist of complex assemblages of species, each with a variable relative abundance. We investigated the consequences of this variability for the composition and distribution of folds by taking the relative abundance of species into account. By annotating the genes of environmental plankton genomes collected by the Tara Oceans expedition with protein folds, we show that the relative abundance of folds deviates from the classical power law and instead follows a Type II Pareto distribution. This model, commonly observed in other complex systems such as economies, allows us to classify folds into different categories that exhibit biogeographical differences. Our results show that scaling fold distributions from individual genomes to species communities leads to a deviation from the expected simple power-law relationship towards a more complex model. This phenomenon could be linked to the variable complexity of marine planktonic ecosystems.
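To make the contrast between the two models concrete, the following minimal sketch compares the survival function of a classical power law (Pareto Type I) with that of a Pareto Type II distribution, also known as the Lomax distribution. It is an illustration only, not the fitting procedure used in the study: the function names and the parameter values (`ALPHA`, `SCALE`, `X_MIN`) are assumptions chosen for demonstration and are not taken from the Tara Oceans data. On a log-log plot a pure power law has a constant slope, whereas the Pareto Type II curve is nearly flat at small values and approaches the power-law slope only in the tail.

```python
import math

def pareto1_survival(x, alpha, x_min):
    """P(X > x) for a classical power law (Pareto Type I), defined for x >= x_min."""
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha

def lomax_survival(x, alpha, scale):
    """P(X > x) for a Pareto Type II (Lomax) distribution, defined for x >= 0."""
    return (1.0 + x / scale) ** (-alpha)

def loglog_slope(survival, x, eps=1e-4, **params):
    """Numerical slope of log S(x) versus log x, using a central difference."""
    lo, hi = x * (1.0 - eps), x * (1.0 + eps)
    d_log_s = math.log(survival(hi, **params)) - math.log(survival(lo, **params))
    return d_log_s / (math.log(hi) - math.log(lo))

# Illustrative parameter values (assumptions, not estimates from the study).
ALPHA, SCALE, X_MIN = 1.5, 10.0, 1.0

for x in (2.0, 10.0, 1000.0):
    s1 = loglog_slope(pareto1_survival, x, alpha=ALPHA, x_min=X_MIN)
    s2 = loglog_slope(lomax_survival, x, alpha=ALPHA, scale=SCALE)
    print(f"x={x:>7}: power-law slope={s1:.3f}, Pareto II slope={s2:.3f}")
```

Analytically, the local log-log slope of the Pareto Type II survival function is -alpha * x / (scale + x), so the scale parameter sets the point at which the curve bends away from a straight power-law line.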