Characterizing a species-rich and understudied tropical insect fauna using DNA barcoding


Abstract

Background

West Africa has high biodiversity that is relatively understudied, especially for insects. Studies of West African arthropod diversity can therefore help address important questions regarding conservation, ecosystem services, insecticide use, and other species-control interventions in agriculture and disease management. We intensively sampled arthropods in Ghana using complementary trapping methods, generated DNA barcodes, and classified sequences by Barcode Index Numbers (BINs), a species proxy. Using this dataset, we investigate assemblage composition, temporal activity patterns, and the state of regional biodiversity sampling.

Results

Sequencing DNA from 95,996 individuals captured using Malaise, yellow pan, pitfall, Heath and Centre for Disease Control (CDC) traps, we identified 10,120 unique BINs. The rate of species accumulation did not approach an asymptote for any taxonomic group or trap type, indicating high biodiversity. The different trap types sampled different subsets of the local community, with the greatest similarity between yellow pan and pitfall traps. More insects and species (BINs) were trapped during the day than at night. Our dataset shared more BINs in the Barcode of Life Database with South Africa than with any other country, although this predominantly reflects the limited sampling and DNA sequencing campaigns in Africa.

Conclusion

This study more than doubles the published BINs for West Africa, offering insights into the biodiversity of an ecologically important but understudied taxon and region. Using multiple trap types allowed a more complete assessment of the local arthropod assemblage. The public release of these data will support and stimulate further taxonomic and ecological work in the region.

Article activity feed

  1. This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giag028), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

    Reviewer 3:

    This paper describes a massive DNA barcoding project of arthropods in Ghana, West Africa, with a dataset of 95,996 individuals and 10,120 BINs (Barcode Index Numbers). The publication is a major contribution to characterizing the biodiversity of tropical insects in a poorly studied area, answers methodological questions concerning trap complementarity and temporal activity, and is an invaluable public resource. The research is well structured, the analyses are sound, and the manuscript is well written. I recommend acceptance after minor revisions to address a few clarifications and technical points.

    1. The manuscript acknowledges that only a subset of individuals was sequenced due to logistical constraints, and for Heath traps, selection was based on wet mass. While the authors argue that sub-sorting aimed to maximize diversity, this could still introduce biases in abundance estimates and BIN accumulation curves. Please include a brief discussion of how this sub-sampling might affect the conclusions (e.g., richness estimates, trap comparisons) and consider adding a sensitivity analysis in the supplement if feasible.

    2. The finding that South Africa shares the most BINs with Ghana despite geographic distance is interesting and attributed to sampling effort. However, the regression model explains only 3% of variance (R² = 0.03), suggesting other factors may be at play. Please discuss potential biogeographic or ecological reasons (e.g., similar habitats, historical connectivity) that might contribute to this pattern, even if sampling effort is the dominant driver.

    3. The use of BINs as a species proxy is appropriate for this study, but the manuscript should briefly acknowledge known limitations (e.g., BINs may over- or under-split species, particularly in poorly studied taxa). A sentence or two in the Discussion would suffice, noting that BINs are a pragmatic tool for biodiversity assessment but not a replacement for formal taxonomy.

    4. Line 381: "insFect" should be "insect".

    5. Table 1 and Table 2 are well-presented, but consider adding a footnote explaining that "BINs unique to trap type" means not found in other trap types in this study.

    6. Line 140: Specify the soap concentration used in pan and pitfall traps.

    7. Line 150: Clarify how "wet mass" was measured (precision, handling protocol).

    8. Line 156: Mention the success rate of PCR and sequencing (how many samples failed?).

    9. Line 360-379: The section on "Taxa of potential human importance" is interesting but could be strengthened by relating findings to local agricultural or health contexts. For example, what do the low numbers of crop pests or disease vectors imply for local management?

    10. Line 390-396: The conclusion could briefly highlight future directions, e.g., integrating morphological taxonomy with BINs, or using this dataset for metabarcoding studies.

    11. Line 228: "Neuroptera had the lowest completeness at 13.5%" - mention the sample size for this order.

    12. Line 302: "β = -1.92, p >0.05" - report the exact p-value.
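
The definition requested in point 5 can be stated as a set operation. The following is only a minimal sketch, assuming records are available as hypothetical `(trap_type, bin_id)` pairs; this is not the authors' data format, and the BIN identifiers are made up for illustration. A BIN counts as unique to a trap type when it was recorded in that trap type and in no other within the study.

```python
from collections import defaultdict


def bins_unique_to_trap(records):
    """Map each trap type to the set of BINs recorded in that trap type only.

    `records` is an iterable of (trap_type, bin_id) pairs. This layout is an
    assumption made for illustration, not the study's actual format.
    """
    # Collect, for every BIN, the set of trap types it was caught in.
    traps_by_bin = defaultdict(set)
    for trap, bin_id in records:
        traps_by_bin[bin_id].add(trap)

    # Start every observed trap type with an empty set so that trap types
    # without any unique BINs still appear in the result.
    unique = {trap: set() for traps in traps_by_bin.values() for trap in traps}
    for bin_id, traps in traps_by_bin.items():
        if len(traps) == 1:
            unique[next(iter(traps))].add(bin_id)
    return unique


# Hypothetical toy records; the BIN identifiers are placeholders.
records = [
    ("Malaise", "BIN-A"), ("Malaise", "BIN-B"),
    ("Pitfall", "BIN-B"), ("Pitfall", "BIN-C"),
    ("Yellow pan", "BIN-C"),
]
result = bins_unique_to_trap(records)
```

In this toy input only `BIN-A` is unique to a trap type (Malaise); `BIN-B` and `BIN-C` are each shared by two trap types and so are unique to none.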

  2. Reviewer 2:

    General Comments:

    This manuscript presents an impressive and highly valuable study that significantly advances our understanding of tropical arthropod diversity in West Africa. The sampling effort is extraordinary (nearly 100,000 individuals sequenced), and the dataset generated more than doubles the number of Barcode Index Numbers (BINs) publicly available for the region. The study is well-designed, employing multiple complementary trap types to capture diverse components of the arthropod community. The analyses are generally robust and appropriate for the research questions. The public release of this large dataset is a major contribution that will undoubtedly stimulate further taxonomic and ecological research in understudied tropical regions. The manuscript is clearly written and well-structured. I am generally in favour of acceptance after minor revisions.

    Specific Comments and Suggestions for Revision:

    1. Visual Documentation of Methods. The manuscript would benefit from including representative photographs of each of the five trap types (Malaise, yellow pan, pitfall, Heath, CDC) as deployed in the field. This is particularly helpful for readers less familiar with entomological methods. Given potential space constraints in the main text, I recommend including these as a Supplementary Figure (e.g., a panel of five photos with concise captions). Please cite this figure in the Methods (Sampling) section.
    2. Robustness of Community Composition Analyses. The NMDS and PERMANOVA results convincingly show differences among trap types. However, the sequencing effort (and thus sample size) varied greatly among traps (e.g., Heath: 65,293 samples vs. CDC: 3,039 samples). Could the authors please clarify if the Bray-Curtis dissimilarity matrices used in these analyses were calculated on standardized or rarefied data to account for this large disparity in sample size? A brief note in the Methods (Data analyses) or figure legend would assure readers that the observed patterns are not primarily an artefact of sampling intensity.

       The finding of significantly higher diurnal catches (individuals and BINs) in Malaise traps is interesting. The discussion briefly mentions variance in thermal conditions. Could the authors expand the Discussion (Diurnal activity patterns) to include other potential ecological or methodological explanations? For example, might this reflect true peaks in flight activity for dominant taxa (Diptera, Hymenoptera), or could it be influenced by trap visibility or wind patterns differing between day and night? A sentence or two of speculation would enrich the interpretation.

       The authors transparently note that only 34 of 117 Malaise lots were fully sequenced and that spiders were removed from some analyses. In the Discussion, please add a short statement evaluating how these practical limitations might have influenced the key conclusions regarding trap complementarity and overall community completeness. For instance, does the high rate of BIN accumulation in Malaise traps (Supplementary Figure 6) suggest that sequencing the remaining lots might have yielded many additional unique BINs, potentially altering the estimated contribution of this trap type?
    3. Minor Editorial and Clarity Points:

       Line 381: There is a typo: "more insFect individuals" should be "more insect individuals".

       Figure 2 & 3 Citations in Text: The in-text citations for Figures 2 and 3 (e.g., lines 239, 274-277) are currently embedded in the legend descriptions copied from the PDF. These should be simplified to standard figure calls (e.g., "(Figure 2)", "(Figure 3A, B)") and the legend text removed from the main manuscript body.
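
The sample-size concern in comment 2 can be illustrated with a small sketch. It is not the authors' pipeline: the trap names are borrowed from the manuscript, but the counts are invented, and a single rarefaction draw is shown only for brevity. Two samples with identical relative composition give a large Bray-Curtis dissimilarity when their totals differ tenfold, which is why subsampling to a common depth (or another standardization) matters before comparing trap types.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` individuals without replacement from a {BIN: count} mapping."""
    pool = [bin_id for bin_id, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("rarefaction depth exceeds the number of individuals")
    return Counter(rng.sample(pool, depth))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-empty {BIN: count} mappings."""
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b))
    total = sum(a.values()) + sum(b.values())
    return 1 - 2 * shared / total


# Invented counts: same proportions (60/30/10 %), tenfold difference in effort.
heath = {"BIN-A": 60, "BIN-B": 30, "BIN-C": 10}
cdc = {"BIN-A": 6, "BIN-B": 3, "BIN-C": 1}

# On raw counts the dissimilarity is 1 - 20/110, driven purely by sample size.
raw = bray_curtis(heath, cdc)

# Subsample the larger sample down to the size of the smaller one.
heath_rarefied = rarefy(heath, sum(cdc.values()), random.Random(0))
standardised = bray_curtis(heath_rarefied, cdc)
```

A single draw is noisy, so in practice the subsampling would normally be repeated many times and the resulting dissimilarities averaged.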
  3. Reviewer 1:

    The manuscript "Characterising a species-rich and understudied tropical insect fauna using DNA Barcoding" by Hemprich-Bennett and co-authors provides DNA barcodes from 95,996 individuals sampled in Ghana using various trap systems. In total, 10,120 unique BINs were identified, including 4,939 that were newly generated. The most sampled taxa were Diptera, Coleoptera, and Lepidoptera. In addition, the authors compared the determined BINs with already published data at BOLD, revealing the greatest overlap in BIN sharing with South Africa. In my eyes, the topic of this manuscript is interesting and suitable for publication in "GigaScience", which focuses on "big data" research. The amount of new sequence data for arthropods, in particular insects, is awesome and represents an important step to assess the (molecular) biodiversity, or better species diversity, of a super diverse region which has hardly been studied so far. The authors use state-of-the-art methods to analyze their data including the BOLD database and BIN approach. However, there are some points that should be added or discussed in a broader context (see below). In addition, please find some specific comments made via sticky notes on the PDF file of the manuscript.

    I feel that the authors should provide some more references on various topics, especially in the introduction but also in the discussion.

    It would be nice to present some maps, photos of the collection sites, the sampling devices as well as the samples themselves as part of the main manuscript, documenting the efforts that were taken.

    A BIN does not per se represent a species, because the variability of the DNA barcode fragment, and of mitochondrial DNA in general, can be affected by various effects, e.g., incomplete lineage sorting, Wolbachia infections (especially true for arthropods), phylogeographic events, hybridization, and others. As a consequence, BIN sharing and splitting can be observed, and in fact such effects are found more often than expected. It is fully clear that such an analysis cannot be done for the given dataset, but a discussion of these effects is important and has been lacking thus far.

    What happened with the vouchers and DNA extracts? It is obvious that the collected specimens will include a high number of undescribed species, therefore the deposition of the voucher specimens is highly important.

    In my eyes it would be interesting to provide a summary of the lengths of the barcodes that were studied. How many barcodes were complete with a length of 658 base pairs? How many were about 300 bp etc.? I think such analysis can be easily done and visualized.
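
A rough sketch of the kind of length summary suggested here, assuming the barcodes are available as plain nucleotide strings. The class boundaries other than the 658 bp full length are arbitrary choices for illustration, and counting only unambiguous A/C/G/T bases is likewise an assumption rather than the authors' convention.

```python
from collections import Counter

FULL_LENGTH = 658  # full barcode length in base pairs, as quoted in the comment


def length_class(seq):
    """Assign a sequence to a length class based on its unambiguous bases."""
    # Gaps ("-") and ambiguity codes such as "N" are not counted (assumption).
    n = sum(1 for base in seq.upper() if base in "ACGT")
    if n >= FULL_LENGTH:
        return "full (>=658 bp)"
    if n >= 500:  # illustrative boundary
        return "500-657 bp"
    if n >= 300:  # illustrative boundary
        return "300-499 bp"
    return "<300 bp"


def summarise_lengths(seqs):
    """Count how many sequences fall into each length class."""
    return Counter(length_class(s) for s in seqs)


# Hypothetical toy sequences, not real barcodes.
toy = [
    "A" * 658,                          # full length
    "C" * 510,                          # 510 unambiguous bases
    "ACGT" * 80,                        # 320 unambiguous bases
    "ACGT" * 50 + "N" * 20 + "-" * 5,   # 200 unambiguous bases
]
summary = summarise_lengths(toy)
```

The resulting counts per class could then be tabulated or plotted as a histogram to give the overview the reviewer asks for.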

    Please find some other specific suggestions for corrections or additions made via notes on the document file of the manuscript.