Message in a Bottle—Metabarcoding enables biodiversity comparisons across ecoregions

D Steinke
S L deWaard
J E Sones
N V Ivanova
S W J Prosser
K Perez
T W A Braukmann
M Milton
E V Zakharov
J R deWaard
S Ratnasingham
P D N Hebert

Abstract

Background

Traditional biomonitoring approaches have delivered a basic understanding of biodiversity, but they cannot support the large-scale assessments required to manage and protect entire ecosystems. This study used DNA metabarcoding to assess spatial and temporal variation in species richness and diversity in arthropod communities from 52 protected areas spanning 3 Canadian ecoregions.

Results

This study revealed the presence of 26,263 arthropod species in the 3 ecoregions and indicated that at least another 3,000–5,000 await detection. Results further demonstrate that communities are more similar within than between ecoregions, even after controlling for geographical distance. Overall α-diversity declined from east to west, reflecting a gradient in habitat disturbance. Shifts in species composition were high at every site, with turnover greater than nestedness, suggesting the presence of many transient species.

Conclusions

Differences in species composition among their arthropod communities confirm that ecoregions are a useful synoptic for biogeographic patterns and for structuring conservation efforts. The present results also demonstrate that metabarcoding enables large-scale monitoring of shifts in species composition, making it possible to move beyond the biomass measurements that have been the key metric used in prior efforts to track change in arthropod communities.

GigaScience
Oct 10, 2022

**Reviewer 4. Christina Lynggaard**

This manuscript assesses the variation in arthropod communities in three ecoregions in Canada. The study is well done, and the sampling was very thorough, with a large sampling effort. I have only minor comments. In particular, I think the aim could focus on the ecoregions rather than on the feasibility of the method, as the latter has already been shown. In addition, it would be nice to have more detail in certain sections of the data analyses and the results. I have addressed these comments below.

- I am not sure why the title is "Message in a bottle".
- Line 65: Could you specify which indicator species have been targeted, or cite studies that target those species?
- Line 96: Given the limitations of the ecoregions, it is not clear why ecoregions are an obvious candidate.
- Line 104: It seems that your aim is to demonstrate how feasible it is to use metabarcoding for large-scale monitoring, and that you use the ecoregions to prove that. However, the feasibility of this method for large-scale studies has already been shown (e.g. Svenningsen et al. 2021, "Detecting flying insects using car nets and DNA metabarcoding"; Bush et al. 2020, "DNA metabarcoding reveals metacommunity dynamics in a threatened boreal wetland wilderness"). I suggest keeping the focus on the need to apply this method in different ecoregions.
- Data description: You mention that you examined phylogenetic diversity, but the Analyses section mentions it only vaguely. The phylogenetic diversity findings are discussed later on, but it is difficult to follow the discussion when the results have not been presented beforehand. In addition, the authors use the phylogenetic diversity findings to support the idea of structure in the ecoregions, so I suggest placing more emphasis on this in the results section.
- Line 189: I agree that the higher number of BINs could be due to eDNA, but could another reason be that the BINs were oversplit during data analysis?
- Lines 215-217: Has this been found previously in other studies using Malaise traps? If so, please cite those findings.
- Line 222: This is a brief discussion of temporal turnover. However, these results are not presented earlier, or at least not clearly enough.
- Lines 266-267: Yes, you showed compositional shifts using metabarcoding of bulk arthropod samples, but the way this sentence is structured makes it sound as if you are the first to show this. Compositional shifts in arthropods have been shown previously in other metabarcoding studies.
- Line 321: Did you have negative PCR controls? In line 326 you mention negative controls, but I assume you are referring to the extraction negative controls.
- Line 340: It is not clear why you queried the data against a bacterial library.
- Line 348: What was the reason for choosing "at least three reads"? The same applies to line 350, where you cluster sequences with a minimum of 5 reads per cluster.
- Line 357: If you see tag switching in your negative controls, you most likely have it in the rest of the data as well. How did you ensure that the rest of the data was free of it? Tag switching may affect sequences that are absent from the negative controls but present in your samples.
- Line 369: As you used the Bray-Curtis index on these metabarcoding data, did you convert your data to presence/absence? Read numbers are known to be inadequate for community analysis of metabarcoding data (see Nichols et al. 2018, "Minimizing polymerase biases in metabarcoding").
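
The presence/absence question raised for line 369 can be illustrated with a minimal sketch. This is not the authors' pipeline: the read counts are hypothetical, and `bray_curtis` and `to_presence_absence` are illustrative helper names. The sketch shows how one BIN with many reads can dominate Bray-Curtis dissimilarity, and that the index reduces to the Sørensen dissimilarity once counts are converted to presence/absence.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / total


def to_presence_absence(counts):
    """Replace every positive read count with 1 and every zero with 0."""
    return [1 if c > 0 else 0 for c in counts]


# Hypothetical read counts for four BINs in two samples (not from the study).
sample_a = [900, 50, 50, 0]
sample_b = [100, 450, 450, 10]

# Dissimilarity driven by read numbers.
bc_reads = bray_curtis(sample_a, sample_b)

# Dissimilarity on presence/absence data (equivalent to Sorensen dissimilarity).
bc_binary = bray_curtis(to_presence_absence(sample_a),
                        to_presence_absence(sample_b))

print(f"read-based: {bc_reads:.3f}, presence/absence: {bc_binary:.3f}")
```

In this toy case the two samples share three of four BINs, so they look similar on presence/absence data but quite different on read counts, which is the discrepancy the comment asks the authors to address.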

**Reviewer 3. Kingsly Beng**

Steinke et al. used DNA metabarcoding of Malaise trap samples from 52 protected areas spanning three Canadian ecoregions to assess the spatial patterns of arthropod biodiversity. The research question is relevant and interesting, the study is well designed, the data collected are comprehensive, and the manuscript is well written and easy to follow. I enjoyed reading it and would like to thank the authors for such a great contribution. My main concern is that the temporal aspect of the study was not explored even though it was mentioned as part of the research objective.

Specific comments

- L60-62: These reductions are not only in abundance but also in diversity, at least based on the fourth reference cited here. I would therefore include "diversity" or "richness" in this statement.
- L63 & L105: The authors use "biosurveillance" in some places in the text and "bio-surveillance" in others. Would it not be better to use the same spelling throughout, for consistency?
- L132: I am a bit confused here. Are these "Analyses" or "Results"? The whole subsection from L133-L176 reads like results to me.
- L329: "of" omitted! Five samples were available from each of the other 22 sites...
- L332-334: The first "following" in this sentence can either be omitted, or that part of the sentence can be completed with "manufacturer's instructions".
- L345-346: "Reads were trimmed 30 bp from their 5' terminus with a set trim length of 450 bp". Perhaps this needs more clarification. The amplified length was 463 bp, and trimming 30 bp gives 433 bp. How then can the set trim length be 450 bp?
- L348-349: What was the criterion for using "at least three reads matched an OTU in the reference database"? Why not at least two or at least four reads? If this was arbitrary, please clarify.
- L349-350: Same question as above: why use "a minimum of five reads per cluster"? It would be nice to indicate whether any benchmarking was applied a priori or whether this was set arbitrarily.
- L346-349: Since the authors were mostly interested in arthropods, were reads that matched sequences from bacteria (SYS-CRLBACTERIA), chordates (SYS-CRLCHORDATA) and non-arthropod invertebrates (SYS CRLNONARTHINVERT) discarded or retained? This should be mentioned here, and estimates of the number of reads, BINs or OTUs matching each of these categories should be provided.
- L149-153: These are interesting results. It would be nice to present them graphically, at least in the supplementary material. The aim of the study was "to assess spatial and temporal variation in species richness and diversity in arthropod communities from 52 protected areas spanning three Canadian ecoregions", but the temporal aspect of the study was not fully explored. Although it is stated that "trap catches were harvested every second week from early May through September", this information has not been used in the analysis. Should the aim of the study then be redefined and restricted to spatial patterns?
- L152-153: Without any table or figure to support these results, why not provide the actual number, proportion or percentage of BINs for each arthropod order in the text?
- L157-158: Please add symbols (e.g. asterisks *, **, *** or letters a, b, c) to Figure 3b to indicate significant differences. The present figure alone, without the text, does not tell the reader whether the differences are significant. Besides, the authors report only a single p value (p < 0.003), which probably means that at least one of the groups differs from the others, but they do not report the pairwise multiple comparison tests that would tell the reader which pairs of groups (e.g. ECF vs EGL, ECF vs SGL, EGL vs SGL) are significantly different.
- L159: Are the patterns similar if you control for the total number of sites per ecoregion? For example, by taking 12 sites per ecoregion and resampling them 100 or 1000 times, similar to the approach used for beta diversity. It could be that one site is driving this pattern, as shown in Figure 2b and reported in L141: "...with more than a third (9,301) found at only one site (Figure 2b)".
- L164-166: Please provide the full PERMANOVA results in a table in the text or the supplementary material and reference it here. It is not clear what "decreased site elevation (R2 = 0.035, P = 0.03)" means.
- L168-171: Do these patterns change or remain the same if the same number of sites per ecoregion is used? This needs to be tested, given that one site (probably from ECF or EGL?) is disproportionately species-rich and SGL has the lowest number of sites.
- L173-176: What about levels of turnover across time? Were there any temporal trends in alpha and beta diversity? Was the temporal aspect dropped from the study objective, and if so, why?
- L221-223: Same question as above: were temporal changes in species composition considered? Which results, tables or figures point to this, or how did the authors arrive at these statements?
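
The equal-effort check suggested for L159 and L168-171 can be sketched as follows. This is a hedged illustration only: the site names and BIN sets are hypothetical, `resampled_richness` is an illustrative helper rather than anything from the manuscript, and pooled BIN richness is just one possible summary statistic.

```python
import random


def resampled_richness(site_bins, n_sites, n_draws, seed=0):
    """Mean pooled BIN richness over repeated random subsets of n_sites sites.

    site_bins maps a site name to the set of BINs detected at that site.
    """
    site_ids = sorted(site_bins)
    if not 1 <= n_sites <= len(site_ids):
        raise ValueError("n_sites must be between 1 and the number of sites")
    rng = random.Random(seed)  # fixed seed so the resampling is repeatable
    totals = []
    for _ in range(n_draws):
        chosen = rng.sample(site_ids, n_sites)  # sites drawn without replacement
        pooled = set().union(*(site_bins[s] for s in chosen))
        totals.append(len(pooled))
    return sum(totals) / len(totals)


# Hypothetical toy ecoregion with three sites (not data from the study).
region_a = {
    "a1": {"BIN1", "BIN2"},
    "a2": {"BIN2", "BIN3"},
    "a3": {"BIN1", "BIN3", "BIN4"},
}

mean_two_sites = resampled_richness(region_a, n_sites=2, n_draws=1000)
all_sites = resampled_richness(region_a, n_sites=3, n_draws=10)
print(mean_two_sites, all_sites)
```

With real data, one would call the function once per ecoregion with the same `n_sites` (12 in the reviewer's example) and compare the resulting distributions, so that an ecoregion with more sites, or a single unusually rich site, does not drive the comparison.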

**Reviewer 2. Shanlin Liu**

Steinke et al. used a metabarcoding method to investigate the species composition of 410 bulk insect samples collected in 3 ecoregions. The manuscript is well written, and all the materials and methods are clearly described. I think the manuscript should be accepted for publication after addressing several minor issues, as follows:

1. Line 126: As Ion Torrent is not widely used nowadays, could the authors add a few words about its sequencing length, error rate, throughput, etc.?
2. Please unify the format of chao 1 (or chao-1).
3. A rarefaction curve for each sample may be needed to check whether the species diversity is well represented by its raw reads.
4. Lines 187-191: This BIN number inflation may also come down to sequence errors introduced during PCR amplification or sequencing.
5. Please pay attention to the citation format. For example, in line 202, reference #40 should follow the first author's name.
6. Lines 226-227: Please add a few words to better explain the speculation of "passively transported by wind".
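
The per-sample rarefaction curve requested in comment 3 can be sketched with the standard hypergeometric rarefaction formula, which gives the expected number of BINs in a random subsample of n reads drawn without replacement. The read counts below are hypothetical and `expected_bins` is an illustrative name, not part of the authors' workflow.

```python
from math import comb


def expected_bins(read_counts, n):
    """Expected number of BINs seen in a random subsample of n reads.

    read_counts holds the number of reads assigned to each BIN in one sample.
    """
    total = sum(read_counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of reads")
    denom = comb(total, n)
    # For each BIN, 1 - P(no read of that BIN falls in the subsample).
    return sum(1 - comb(total - c, n) / denom for c in read_counts if c > 0)


# Hypothetical read counts per BIN for one sample (not data from the study).
counts = [500, 300, 150, 40, 8, 2]
depths = range(0, sum(counts) + 1, 100)
curve = [(n, expected_bins(counts, n)) for n in depths]

for n, s in curve:
    print(n, round(s, 2))
```

A curve that flattens well before the full read depth suggests the sample's diversity is adequately covered by its reads; a curve still rising at the full depth suggests it is not.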

This work has been published in GigaScience under a CC-BY 4.0 license (https://doi.org/10.1093/gigascience/giac040), and the reviews have been published under the same license. These are as follows.

**Reviewer 1. Camila Duarte Ritter**

The manuscript is very well written and a great contribution to the field. However, some analytical aspects need to be better described. Also, it would be great if the authors provided their R script in the supplementary material. My comments are below.

- Line 166: R2 = 0.035 is very low; this needs more careful consideration.
- Lines 168-171: Was the alpha diversity comparison based only on visual inspection, or was a test performed?
- Lines 173-176: Was there any test of significance? It needs to be reported.
- Lines 213-219: This is a nice discussion of local versus regional diversity, but it is very speculative and needs at least some citations to support it.
- Lines 357-358: This reduces background contamination; you can never remove all of it.
- Lines 365-367: How were the distances controlled for? Was there any analysis of spatial correlation?
- Lines 367-370: Was the NMDS run on abundance or presence/absence data? If abundance, was any correction applied?
- Lines 374-376: How did the authors check the quality of the tree, given that it was built from a very short fragment? Did the black-box tool set all the model parameters?
- Line 382: Was any correction applied to the BIN table, such as rarefaction or Shannon entropy? This is essential for metabarcoding data. Also, why only BIN richness? Other diversity measures could be included, such as Shannon or Fisher diversity in phyloseq, or the effective number of BINs with entropart.
- Figure 1 needs a reference to Canada to make clear where the region is.
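
The diversity measures mentioned for line 382 can be illustrated with a minimal sketch of Shannon entropy and the corresponding effective number of BINs (the exponential of the entropy). This is plain Python rather than the phyloseq or entropart packages named above; the counts are hypothetical, the function names are illustrative, and treating read counts as abundances is itself an assumption that the reviews question.

```python
from math import exp, log


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a vector of non-negative counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)


def effective_number(counts):
    """Effective number of BINs: exp of the Shannon entropy."""
    return exp(shannon_entropy(counts))


# Hypothetical samples, each with four BINs (not data from the study).
even_sample = [10, 10, 10, 10]
skewed_sample = [97, 1, 1, 1]

even_effective = effective_number(even_sample)
skewed_effective = effective_number(skewed_sample)
print(round(even_effective, 2), round(skewed_effective, 2))
```

Both samples have a BIN richness of four, but the effective number is four only when the BINs are evenly represented, which is why the comment asks for measures beyond richness.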

Re-review:

The study is very well designed and written, with good and clear results. The authors have considered all my previous comments; a few additional minor comments are below.

- Lines 118-119: Species (BIN) richness is a measure of alpha diversity, and change in community composition is a measure of beta diversity.
- Lines 112-122: Malaise traps also collect some local non-flying insects at random. Discussing the catch as representative of the local population is fine, but I miss a mention of this random sampling and of the fact that the absence of such insects from the samples does not necessarily mean they are absent from the site.
- Lines 243-246: The statement "Although current metabarcoding protocols cannot estimate the abundance of each species" is not completely right. Many metabarcoding studies now estimate the abundance or biomass of species, so some discussion of this is necessary. Some examples (among several others):
  - Elbrecht, V., & Leese, F. (2015). Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass sequence relationships with an innovative metabarcoding protocol. PLoS ONE, 10(7), e0130324.
  - Thomas, A. C., Deagle, B. E., Eveson, J. P., Harsch, C. H., & Trites, A. W. (2016). Quantitative DNA metabarcoding: improved estimates of species proportional biomass using correction factors derived from control material. Molecular Ecology Resources, 16(3), 714-726.
  - Di Muri, C., Lawson Handley, L., Bean, C. W., Li, J., Peirson, G., Sellers, G. S., ... & Hänfling, B. (2020). Read counts from environmental DNA (eDNA) metabarcoding reflect fish abundance and biomass in drained ponds. Metabarcoding and Metagenomics, 4, 97-112.
  - Ershova, E. A., Wangensteen, O. S., Descoteaux, R., Barth-Jensen, C., & Præbel, K. (2021). Metabarcoding as a quantitative tool for estimating biodiversity and relative biomass of marine zooplankton. ICES Journal of Marine Science, 78(9), 3342-3355.

For the figures comparing the ecoregions, as there are only three, I would recommend a color-blind-safe palette; orange, yellow and green is not a good choice.

Version published to 10.1093/gigascience/giac040
Jan 1, 2022
Version published to 10.1101/2021.07.05.451165 on bioRxiv
Jul 6, 2021
