Ribosomal RNA (rRNA) sequences from 33 globally distributed mosquito species for improved metagenomics and species identification

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This manuscript generates a valuable new genetic resource for studying mosquitoes and the pathogens that they carry. For 33 species of mosquitoes, the authors have sequenced and assembled the ribosomal RNA, which will dramatically improve the power of RNA sequencing in mosquitoes.

Abstract

Total RNA sequencing (RNA-seq) is an important tool in the study of mosquitoes and the RNA viruses they vector as it allows assessment of both host and viral RNA in specimens. However, there are two main constraints. First, as with many other species, abundant mosquito ribosomal RNA (rRNA) serves as the predominant template from which sequences are generated, meaning that the desired host and viral templates are sequenced far less. Second, mosquito specimens captured in the field must be correctly identified, in some cases to the sub-species level. Here, we generate mosquito rRNA datasets which will substantially mitigate both of these problems. We describe a strategy to assemble novel rRNA sequences from mosquito specimens and produce an unprecedented dataset of 234 full-length 28S and 18S rRNA sequences of 33 medically important species from countries with known histories of mosquito-borne virus circulation (Cambodia, the Central African Republic, Madagascar, and French Guiana). These sequences will allow both physical and computational removal of rRNA from specimens during RNA-seq protocols. We also assess the utility of rRNA sequences for molecular taxonomy and compare phylogenies constructed using rRNA sequences versus those created using the gold standard for molecular species identification of specimens—the mitochondrial cytochrome c oxidase I (COI) gene. We find that rRNA- and COI-derived phylogenetic trees are incongruent and that 28S and concatenated 28S+18S rRNA phylogenies reflect evolutionary relationships that are more aligned with contemporary mosquito systematics. This significant expansion to the current rRNA reference library for mosquitoes will improve mosquito RNA-seq metagenomics by permitting the optimization of species-specific rRNA depletion protocols for a broader range of species and streamlining species identification by rRNA sequence and phylogenetics.
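
The computational removal of rRNA mentioned above can be illustrated with a minimal sketch. It assumes a simple k-mer matching scheme and is not the authors' pipeline: the function names (`build_rrna_index`, `is_rrna`, `deplete`), the k-mer length, and the matching threshold are all hypothetical choices made for illustration. In practice, reads are usually screened with an aligner or a dedicated rRNA-filtering tool using reference sequences such as the 28S and 18S sequences described here.

```python
from typing import Iterable, List, Set, Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercased)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in a sequence (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_rrna_index(references: Iterable[str], k: int = 21) -> Set[str]:
    """Index k-mers from both strands of every rRNA reference sequence."""
    index: Set[str] = set()
    for ref in references:
        index |= kmers(ref, k)
        index |= kmers(reverse_complement(ref), k)
    return index


def is_rrna(read: str, index: Set[str], k: int = 21,
            min_fraction: float = 0.5) -> bool:
    """Flag a read as rRNA if enough of its k-mers occur in the index.

    min_fraction is an illustrative threshold, not a validated value.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:  # read shorter than k: cannot be assessed, keep it
        return False
    hits = sum(1 for km in read_kmers if km in index)
    return hits / len(read_kmers) >= min_fraction


def deplete(reads: Iterable[str], index: Set[str], k: int = 21,
            min_fraction: float = 0.5) -> Tuple[List[str], List[str]]:
    """Split reads into (kept, removed), where removed reads look like rRNA."""
    kept: List[str] = []
    removed: List[str] = []
    for read in reads:
        if is_rrna(read, index, k, min_fraction):
            removed.append(read)
        else:
            kept.append(read)
    return kept, removed
```

Indexing both strands lets the filter catch reads regardless of orientation. A broader reference set, such as one covering more species, would let the same filter be applied to specimens for which no rRNA reference was previously available.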

Article activity feed

  1. Author Response

    Reviewer #2 (Public Review):

    Weaknesses

    The authors' approach, as with traditional approaches to molecular identification of vector species, relies on expert entomologists capable of identifying mosquitoes in the field, and such expertise is rare in most places. The authors do not provide citations for the taxonomic keys used for morphological identification, which in many places are outdated or unavailable for specific locations.

    We have added references for taxonomic identification keys in lines 677–679.

    The authors give no explanation as to why they chose rRNA-seq as their method of next-generation sequencing, which is most commonly used for transcriptomics, instead of traditional DNA-based metagenomics, which is more commonly used to define community relationships and would be more appropriate for this study.

    We have added a sentence in the Introduction (lines 65–66) to explain why RNA-seq is a frequent choice for surveillance and virus discovery in mosquitoes.

  2. eLife assessment

    This manuscript generates a valuable new genetic resource for studying mosquitoes and the pathogens that they carry. For 33 species of mosquitoes, the authors have sequenced and assembled the ribosomal RNA, which will dramatically improve the power of RNA sequencing in mosquitoes.

  3. Reviewer #1 (Public Review):

    This paper provides de novo assembly of full-length 18S and 28S rRNA sequences from 33 mosquito species for which no genome sequence exists. This is a very useful approach and dataset, and it provides a new tool by which wild-caught mosquitoes can be identified to species. Additionally, the existence of rRNA reference sequences will allow more effective depletion of these hyperabundant RNA species prior to investing in RNA-seq of other cellular RNAs from a given sample. It is interesting to see how phylogenetic trees constructed using 28S rRNA compare with those built from the more standard mitochondrial cytochrome c oxidase I gene. The availability of these data will be very useful for field entomologists, and the method by which the rRNAs were obtained may be broadly useful for scientists contemplating a similar approach in less-studied species of medical or biological importance.

  4. Reviewer #2 (Public Review):

    The authors have provided a large dataset of ribosomal RNA sequences to assist in the molecular identification of rare and unstudied medically important mosquitoes in four locations with high biodiversity and mosquito-borne virus circulation: Cambodia, French Guiana, Madagascar, and the Central African Republic. This was accomplished using a non-traditional approach, rRNA-seq, which could help in the identification of novel potential vectors of disease in hotspots of transmission. Their method uses previously published insect and non-insect-specific rRNA sequences from multiple locations to perform "depletion" of interfering rRNA. This method allowed the authors to create both 28S and 18S sequences for the identification of novel species of mosquito vectors with high reliability based on phylogenetic analysis, with utility where traditional cytochrome oxidase subunit I sequences are not available for systematics.

    Strengths:
    The non-traditional approach used is well described and provides novel guidance for researchers undertaking similar studies.

    The depletion method described allowed the authors to identify mosquito rRNA sequences even in the instance of non-target RNA being present.

    Weaknesses:
    The authors' approach, as with traditional approaches to molecular identification of vector species, relies on expert entomologists capable of identifying mosquitoes in the field, and such expertise is rare in most places. The authors do not provide citations for the taxonomic keys used for morphological identification, which in many places are outdated or unavailable for specific locations.

    While next-generation sequencing is becoming more available, it is still largely out of reach for researchers lacking resources and infrastructure, a situation that is common in locations similar to those for which the authors provide these data.

    The authors give no explanation as to why they chose rRNA-seq as their method of next-generation sequencing, which is most commonly used for transcriptomics, instead of traditional DNA-based metagenomics, which is more commonly used to define community relationships and would be more appropriate for this study.

  5. Reviewer #3 (Public Review):

    This manuscript generates a valuable new genetic resource for mosquito research. The ribosomal RNA (rRNA) data generated for 33 mosquito species will ultimately enable physical subtraction of rRNA from mosquito RNA preps prior to sequencing, something that has not been possible for most mosquito species. This will dramatically improve the power of RNA sequencing in the mosquito field. Since mosquitoes harbor many RNA viruses, this is very important and removes a major roadblock to the study of mosquitoes and their viruses.

    In addition, the authors seem to show that rRNA-based taxonomical identification of mosquitoes is superior to traditional COI-based taxonomy. This would be a very important finding if true, but the authors never unequivocally conclude this.