Single mosquito metatranscriptomics identifies vectors, emerging pathogens and reservoirs in one assay
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (Review Commons)
Abstract
Mosquitoes are major infectious disease-carrying vectors. Assessment of current and future risks associated with the mosquito population requires knowledge of the full repertoire of pathogens they carry, including novel viruses, as well as their blood meal sources. Unbiased metatranscriptomic sequencing of individual mosquitoes offers a straightforward, rapid, and quantitative means to acquire this information. Here, we profile 148 diverse wild-caught mosquitoes collected in California and detect sequences from eukaryotes, prokaryotes, 24 known and 46 novel viral species. Importantly, sequencing individuals greatly enhanced the value of the biological information obtained. It allowed us to (a) speciate host mosquito, (b) compute the prevalence of each microbe and recognize a high frequency of viral co-infections, (c) associate animal pathogens with specific blood meal sources, and (d) apply simple co-occurrence methods to recover previously undetected components of highly prevalent segmented viruses. In the context of emerging diseases, where knowledge about vectors, pathogens, and reservoirs is lacking, the approaches described here can provide actionable information for public health surveillance and intervention decisions.
Article activity feed
Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Reply to the reviewers
We are grateful for the careful read and constructive comments provided by the 3 reviewers assigned to our manuscript. Each reviewer provided thoughtful and clearly structured comments that helped us to better clarify points or summarize results in the manuscript that they indicated were not presented clearly or completely. We have revised the manuscript to address the points raised by the reviewers, incorporating edits and additional text throughout the manuscript, figure legends, and supplemental materials. We feel the revised version of the manuscript is much improved as a result of the revisions in response to the reviewers.
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #3
Evidence, reproducibility and clarity
Summary:
The authors demonstrate a powerful method that applies mNGS to individual mosquitoes with reference-free analysis. This allows researchers to combine the resulting datasets of mosquito identification, blood-meal source, microbiome, viral sequencing, etc. Such knowledge could be a useful tool in detecting and responding to transmission of mosquito-borne diseases that affect human or animal populations, even though the technology is currently likely too expensive for widespread use (as acknowledged by the authors).
Major Comments:
No major revisions requested.
The authors provide their detailed methodology, including code, allowing for replication by other groups.
Minor Comments:
The authors' discussion of using this technique in order to detect pathogens should be qualified regarding detection vs possible transmission. Detecting a virus in an engorged mosquito does not necessarily mean that said mosquito can transmit the virus, but may have simply acquired it from a recent blood meal. The same can be said of detecting a plant pathogen following a recent sugar meal.
From the methods, it seems that mosquitoes were not washed prior to processing. This may make it difficult to discriminate between internal and external microbiota as well as lead to cross-contamination of surface microbiota between mosquitoes collected in the same trap.
Significance
This work would currently be of interest to other research groups examining the co-occurrence of pathogens, other microbiota, and blood meals in field-collected mosquitoes. While of great potential application to public health surveillance, the current cost is likely prohibitive.
My field of expertise is virology and vector biology with minimal background in NGS.
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #2
Evidence, reproducibility and clarity
Summary:
In this study, the authors used unbiased metatranscriptomic sequencing of 148 diverse wild-caught mosquitoes (Aedes, Culex, and Culiseta species) collected in California, with the main aim of detecting sequences of eukaryotic, prokaryotic, and viral origin. Their results show that the majority of their sequence data assembled into contigs corresponding to viral genomes. In their data, 7.4 million viral reads clustered as +ssRNA viruses, including Solemoviridae, Luteoviridae, Tombusviridae, Narnaviridae, Flaviviridae, Virgaviridae, and Filovirida, whereas 2.25 million viral reads were identified as -ssRNA viruses, comprising Peribunyaviridae, Phasmaviridae, Phenuiviridae, Orthomyxoviridae, Chuviridae, Rhabdoviridae, and Xinmoviridae. With 0.94 million viral reads, dsRNA viruses formed the third most abundant virus category, with viruses from the families Chrysoviridae, Totiviridae, Partitiviridae, and Reoviridae. Among the prokaryotic taxa, Wolbachia was the dominant group, followed by lower-abundance bacterial taxa that include Alphaproteobacteria, Gammaproteobacteria, the Terrabacteria group, and Spirochaetes. Trypanosomatidae was the most dominant eukaryotic taxon, followed by reads from the Bilateria and Ecdysozoa taxa. Ultimately, this study demonstrates that single-mosquito metatranscriptomic analysis has the potential to identify vectors of human health significance, the emerging pathogens they transmit, and their reservoirs, all in one assay.
Major comments:
1. Are the key conclusions convincing? The conclusions are accurate.
2. Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? None. The study's results, discussion and conclusion are appropriate.
3. Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.
As much as the authors describe the use of mNGS as a tool for validating mosquito species and providing an unbiased look at vector-associated pathogens, it would still be prudent for them to use qPCR to validate the RNA-seq data (e.g., validation of the viral sequences).
4. Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. The outlined methodology is realistic.
5. Are the data and the methods presented in such a way that they can be reproduced? The methodology is reproducible.
6. Are the experiments adequately replicated and statistical analysis adequate? Yes.
Minor comments:
1. Specific experimental issues that are easily addressable. qPCR validation of the obtained RNA-seq data should be conducted.
2. Are prior studies referenced appropriately? Recent publications about the mosquito microbiome/virome should be added (e.g., doi: 10.1128/mSystems.00640-20).
3. Are the text and figures clear and accurate? The resolution of Fig 4, Fig 6, SFig 2, SFig 4, and SFig 5 is poor. The authors should update them.
4. Do you have suggestions that would help the authors improve the presentation of their data and conclusions? (1) In the Methods section, were the mosquitoes washed before RNA extraction to avoid contamination from the environment? (2) Most of the non-host reads matched viruses (10.5M), whereas only a few belonged to prokaryotes. Does this mean that mosquitoes carry more viruses than prokaryotes? (3) None of the mosquito-borne viruses known to occur in California (e.g., WNV, SLEV, WEEV) appear in Table 1 among the viruses detected with complete genomes in this study. At the contig level, did the authors detect any mosquito-borne virus known to occur in California? Because mNGS is very sensitive and this study includes a large number of samples, the authors should discuss why no known mosquito-borne virus was detected.
Significance
1. Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. With the existential threat of emerging novel pathogens of global health concern, efficient and rapid public health surveillance strategies are crucial in monitoring and possibly averting such eventual calamities. Specifically, mosquitoes are widely diverse and are known to harbor and transmit various pathogenic agents to humans and animals. Thus, this rapid identification of relevant vector species, pathogens and their reservoirs in one assay is a promising and convenient aspect of surveillance in the public health sector.
2. Place the work in the context of the existing literature (provide references, where appropriate). Shi et al. reported the first single-mosquito viral metagenomics study, in which they demonstrated the feasibility of using single mosquitoes for viral metagenomics, a methodology that has the potential to provide much more precise virome profiles of mosquito populations. In the present study, the authors have gone a step further by aiming to combine three objectives in single-mosquito metatranscriptomics, as described briefly in their abstract and in the comprehensive methodology outline.
Reference: Shi, C., Beller, L., Deboutte, W. et al. Stable distinct core eukaryotic viromes in different mosquito species from Guadeloupe, using single mosquito viral metagenomics. Microbiome 7, 121 (2019). https://doi.org/10.1186/s40168-019-0734-2
3. State what audience might be interested in and influenced by the reported findings. The methodology and findings described in this manuscript are important in advancing the public health field of vector surveillance. The identification of relevant vector species, pathogens and their reservoirs in one assay is a promising and convenient aspect of surveillance.
4. Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.
I am an Associate Professor at a research institute. My lab's research focuses on arbovirology, more specifically vector surveillance of known and novel viruses associated with mosquitoes and ticks, mosquito transcriptomic studies, mosquito virus tropism studies, and other related mosquito-virus interaction studies.
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #1
Evidence, reproducibility and clarity
This is a very interesting and well-designed study on mNGS of mosquitoes. The authors demonstrate that they can distill valuable information on the vector species, the source of the blood meals and the microbiome/virome using a simple experimental approach and using single mosquitoes. A highlight of the work is that the paper is very comprehensive with an overwhelming dataset and thoughtful analysis. It is a showcase of how sequencing data from a relatively compact number of mosquito specimens can be used to conduct sophisticated computational analysis leading to meaningful conclusions. The authors make a strong case for the power of mNGS of mosquitoes that may be applicable to other (invertebrate) species. Especially the phylogenetic analysis based on SNP distance without having reference genomes and the grouping of contigs by means of co-occurrence in datasets are original. We feel that the work deserves to be published.
Significance
We have a number of comments that the authors may consider in further improving the quality of their manuscript:
What is the impact of this paper?
I think it is possible that the paper will have a decent impact on the mosquito arbovirus field, because it adequately shows the possibilities that individual mosquito sequencing can bring (e.g. co-occurrence analysis). It may shift the balance to doing more individual mosquito sequencing instead of pools. The paper is also very extensive in the analyses that it does on this very rich data set. Below, some suggestions are given for additional analysis, which should be interpreted as a compliment to the interesting data set acquired. It should however be noted that the ideas and approaches taken are not entirely new. Sequencing individual mosquitoes, co-occurrence analysis and metagenomic sequencing have been done before, although not to this extent and not in this field. Several novel possibilities:
- An unbiased way to check if you have the correct mosquito species and the ability to detect subspecies. Using the genetic distance of the transcriptomes, they have likely corrected the misidentification of some samples, where it is understandable that a mistake was made in the original calls. The fact that subspecies overlapped with the sites of capture is very interesting and confirms the relevance of looking at the genetic distance also within species.
- Blood-meal analysis from sequence data. Here they can get to species level for 10 out of 40 blood-engorged mosquitoes. The idea is interesting, as you would be able to get a lot more information if you can determine blood-meal origin from RNA-seq data (as shown in this paper). However, I feel that in the current paper (and this may be intentional) they do not properly show that RNA-seq is an adequate alternative to DNA sequencing of the blood. To convince me, I would have liked to have these results compared to DNA sequencing and see how much overlap there is. I understand, however, that the choice was made not to do this, but I do have a small note on the information given now. It was mentioned that 1 contig with an LCA of vertebrates is enough for a 'blood-meal origin' call. I am, however, left to wonder how reliable 1 read is. Are there really no contigs with an LCA in vertebrates in the non-blood-fed mosquitoes? Also, what do we think happened in the mosquitoes that were visibly blood-fed but in which nothing was found; any speculation?
- The study of co-occurrence, although not novel, is a nice addition to the mosquito virome/microbiome determination field. Identifying novel segments and missed segments of viruses is very nice. I do however wonder: did it ever occur that co-occurrence finds a 'linked' fragment that was clearly wrong? Were some post-analyses done to check if the results make sense? It seems, especially because the paper elaborates on examples, that you need some follow-up. This is not problematic, but a nice addition to the paper would be (as is also described below) to mention which segments were added to viral genomes by co-occurrence and if some checks were done to verify these hits.
- Being able to say something about differences in viruses within the same mosquito species is super interesting. Pools do not give the possibility to say something about profiles and prevalence, and the large sample size (148 mosquitoes) allows interesting correlations to be found.
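The co-occurrence idea discussed in the bullets above can be illustrated with a minimal sketch. This is not the authors' implementation: the contig names, sample IDs, and the choice of Jaccard similarity with a fixed threshold are all assumptions made here for illustration.

```python
from itertools import combinations

# Hypothetical presence/absence data: contig -> set of mosquito samples
# in which it was detected. All names are invented for illustration.
presence = {
    "contig_RdRp":    {"m01", "m02", "m05", "m07"},
    "contig_unknown": {"m01", "m02", "m05", "m07"},
    "contig_partial": {"m01", "m02", "m05"},
    "contig_other":   {"m03", "m04"},
}

def jaccard(a, b):
    """Jaccard similarity of two sample sets (1.0 = identical occurrence pattern)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def linked_pairs(presence, threshold=0.9):
    """Return contig pairs whose occurrence patterns overlap at or above `threshold`."""
    pairs = []
    for (c1, s1), (c2, s2) in combinations(sorted(presence.items()), 2):
        score = jaccard(s1, s2)
        if score >= threshold:
            pairs.append((c1, c2, score))
    return pairs

# Strict threshold: only contigs with (near-)identical patterns are linked.
print(linked_pairs(presence))
# Looser threshold: more candidate links, each of which would need follow-up checks.
print(linked_pairs(presence, threshold=0.75))
```

With the strict threshold only the two contigs that occur in exactly the same samples are linked. Loosening the threshold also pulls in the contig that is missing from one sample, which is the kind of candidate link that would need the post-hoc verification asked about above.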
What parts do you think are problematic?
- We question the validity of the 'blood-meal calls' as outlined above.
- In this study they use % of non-host reads as a measure for the abundance of a pathogen (see e.g. Figure 3). I don't understand this at all... If you have more pathogens, then the amount of non-host reads would have to go up, right? It seems to assume that the amount of non-host reads is similar in all samples. It becomes even more problematic when the trend is mentioned that a higher % of non-host reads for Wolbachia is related to a lower % of non-host reads for viruses. This seems trivial, as the amount of non-host reads goes up with increased Wolbachia infection, and therefore the % of non-host reads for viruses goes down due to the larger denominator. A different number than 'non-host reads' should be taken to normalise the data and say something about abundance, e.g. host reads or spiked RNA.
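The denominator effect described in the last bullet can be made concrete with a small numerical sketch. The read counts below are invented for illustration and are not taken from the manuscript; `host` stands in for any independent normalizer (host reads or a spike-in), which is an assumption of this example.

```python
# Hypothetical read counts for two mosquitoes with the SAME number of viral reads;
# only the Wolbachia read count differs. Values are illustrative, not from the study.
samples = {
    "low_wolbachia":  {"virus": 1000, "wolbachia": 100,  "host": 100000},
    "high_wolbachia": {"virus": 1000, "wolbachia": 9000, "host": 100000},
}

def percent_of_non_host(sample, taxon):
    """Share of non-host reads assigned to `taxon`, in percent."""
    non_host = sample["virus"] + sample["wolbachia"]
    return 100.0 * sample[taxon] / non_host

def per_host_read(sample, taxon):
    """Reads assigned to `taxon` per host read (an independent denominator)."""
    return sample[taxon] / sample["host"]

for name, s in samples.items():
    print(name,
          round(percent_of_non_host(s, "virus"), 1),  # changes with Wolbachia load
          per_host_read(s, "virus"))                  # unchanged between the samples
```

In this toy case the viral share of non-host reads drops sharply in the Wolbachia-rich sample even though the viral read count is identical, whereas the per-host-read value is the same for both samples.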
What are the most relevant questions you are left with?
- I am curious about the limited overlap with Sadeghi et al., 2018, who sequenced so many Culex mosquitoes in California. I would suggest saying a little bit more about these discrepancies and their potential causes in the discussion.
- What do the authors think are in those 'dark reads'? Is the amount of dark reads the same across the different samples? Similarly, are the 'tetrapoda' reads reduced/absent in mosquitoes with a reference genome available?
- In the first part of the results, mention is made of being able to characterize to kingdom level 77% of the 13 million non-host reads (also see the comment on non-host reads below). I am, however, puzzled by the description in the text and in supplemental figure 3: which 3 million contigs could not be characterized? Where in supplemental figure 3 are they? This is especially puzzling as the main text mentions that 11 million non-host reads are from complete viral genomes, 0.9 million from eukaryotic taxa and 0.7 million from prokaryotic taxa.
- There seem to be 131 bars, corresponding to individual mosquitoes, in figure 3? Where are the remaining 17?
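The read-accounting question raised above (77% of 13 million versus the per-group totals) can be checked with a few lines of arithmetic. The figures are those quoted in the comment, in millions of reads; how they should be reconciled is for the authors to clarify, and the variable names are chosen here for illustration.

```python
# Figures as quoted in the comment above (in millions of reads).
non_host_total = 13.0
fraction_characterized = 0.77
by_group = {"viral": 11.0, "eukaryotic": 0.9, "prokaryotic": 0.7}

characterized = fraction_characterized * non_host_total
uncharacterized = non_host_total - characterized
sum_of_groups = sum(by_group.values())

print(f"characterized (77% of 13 M): {characterized:.2f} M")
print(f"uncharacterized:             {uncharacterized:.2f} M")
print(f"sum of quoted groups:        {sum_of_groups:.2f} M")
print(f"groups minus characterized:  {sum_of_groups - characterized:.2f} M")
```

The quoted group totals add up to more than 77% of 13 million, which is the discrepancy the comment asks the authors to explain.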
What are your tips (in addition to responses to above questions)?
- I think the definition of 'non-host reads' needs to be stated clearly and used consistently across the document. At the end of the paragraph 'Comprehensive and quantitative analysis of non-host sequences detected in single mosquitoes', the concept of "...13 million non-host reads..." is introduced. At first glance, supplemental figure 3 suggests that "non-host reads" could also be defined as the 16.7 million aligned reads that are left after putative host sequences are removed. Although the derivation of the 13 million is explained in the legend of supplemental figure 3, it would be easier for the reader (it cost me some time) if this were explained in the main text. In addition, does the definition of 'non-host reads' (the 13 million reads) correspond to the "classified non-host reads" in the following excerpt: "For every sample, "classified non-host reads" refer to those reads mapping to contigs that pass the above filtering, Hexapoda exclusion, and decontamination steps. "Non-host reads" refers to the classified non-host reads plus the reads passing host filtering which failed to assemble into contigs or assembled into a contig with only two reads."? This caused some confusion.
- I believe it would be a valuable addition to add a table for the viruses which includes: 1) How it was determined that the complete genome is there, 2) The percentage overlap for those segments that were identified with blast and 3) Which viruses were already known.
- Please state the numbers of mosquitoes caught somewhere in the Materials and Methods.
- Pg2 L1-3: "Metagenomic sequencing..... a single assay." Perhaps a bit early for this statement. I would suggest placing it two paragraphs later, before: "Here, we analyzed...."
- Figure S4 is too pixelated to read. Perhaps due to pdf conversion, but please do check before submission.