Transcriptomic signatures of disease tolerance and environmental persistence in a mosquito-microsporidian system

Abstract

Parasite life histories are expected to shape virulence and transmission, but the host programmes involved are largely unknown. Experimental evolution and RNA-seq were combined in a mosquito-microsporidian system, Anopheles gambiae-Vavraia culicis, to test how evolved parasite strategies alter host gene expression. A mosquito population was infected with parasite lineages selected for early or late transmission and compared with infections by an unselected reference parasite and with uninfected controls. Whole-body transcriptomes were all sampled at a common sporulating stage of infection. A core infection signature was identified by contrasting reference infections with uninfected mosquitoes. Against this baseline, early and late parasites generated distinct, partly overlapping expression profiles. Network analysis revealed modules whose activity tracked infection background and estimates of virulence and parasite load. One module was negatively associated with virulence and enriched for genes involved in damage limitation and tissue integrity, consistent with a disease tolerance programme. A second was positively associated with environmental persistence and enriched for genes that condition the replication and release environment, indicating that host pathways can influence the robustness of transmission stages. These findings link parasite life history, host tolerance and environmental persistence and highlight molecular targets for eco-evolutionary studies of vector-borne disease.

Article activity feed

    1. Mosquito response to selected lines of V. culicis

    I'm curious whether the pangenome differs between the two strains of V. culicis. Since there are only a few hundred differentially expressed genes, could the difference be pinpointed to, e.g., a lost gene or a regulatory SNP that then controls downstream expression of those genes? A pangenome analysis might be hard if both strains haven't been sequenced, but you may be able to use your data to get an idea of which transcription factors contribute to the differences.

  1. Compared to uninfected controls, the 20 most up- and down-regulated transcripts were then focused on

    Focused on for what purpose? I don't see where this focus is applied other than in Table 1.

  2. For down-regulated genes, 81 genes were suppressed by early-selected parasites. In comparison, 71 genes were down-regulated by late-selected parasites, with 28 genes commonly down-regulated in both treatments (Fig. 2a). For up-regulated genes, 113 genes were up-regulated in early-selected parasites, compared to 52 in late-selected parasites, with 77 genes commonly up-regulated in both groups (Fig. 2b). These findings suggest that a higher number of DE genes are driven by early-selected parasites compared to late-selected ones, at least during the 10th day of host adulthood. However, a notable overlap exists between both treatments (Fig. 2ab).

    This section is a little hard to keep track of. I've had success using UpSet plots as an alternative to Venn diagrams in the past and find they help with interpretation.
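The bookkeeping behind an UpSet plot can be sketched in a few lines. This is only an illustration: the gene IDs are hypothetical placeholders, and the set sizes assume that the quoted up-regulated counts (113 early, 52 late, 77 common) are treatment-specific, which seems to be the only consistent reading since 77 shared genes could not fit inside a total of 52. The helper name `exclusive_intersections` is made up for this sketch. The resulting counts are what a plotting package such as UpSetR (R) or upsetplot (Python) would draw as bars.

```python
def exclusive_intersections(sets):
    """Map each combination of set names to the genes found in exactly those sets."""
    result = {}
    for gene in set().union(*sets.values()):
        members = tuple(sorted(name for name, genes in sets.items() if gene in genes))
        result.setdefault(members, set()).add(gene)
    return result


# Hypothetical gene IDs, sized to match the quoted up-regulated counts
# under the assumption that 113 and 52 are treatment-specific.
shared = {f"gene_{i}" for i in range(77)}
early_only = {f"gene_{i}" for i in range(77, 190)}
late_only = {f"gene_{i}" for i in range(190, 242)}

up_regulated = {"early": shared | early_only, "late": shared | late_only}

counts = {combo: len(genes)
          for combo, genes in exclusive_intersections(up_regulated).items()}

# One row per intersection, largest first, as an UpSet plot would order its bars.
for combo, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(" & ".join(combo), n)
```

Reporting the exclusive intersections alongside the per-treatment totals (here 190 for early and 129 for late) removes the ambiguity about whether a quoted count includes the shared genes.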

  3. Out of the 9,900 annotated genes, 863 (8.7%) were significantly down-regulated, 1104 (11.1%) were significantly up-regulated, and 7953 (80.2%) showed no differential expression (DE).

    Did you use just a p value cut off or filter on log2 fold change as well?
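To show why the answer matters, here is a minimal sketch of how the up/down/unchanged tallies shift when a fold-change filter is added to an adjusted p-value cut-off. The gene IDs, values, the 0.05 FDR threshold and the |log2FC| >= 1 threshold are all assumptions for illustration; the preprint excerpt does not state which thresholds were used.

```python
def classify(results, fdr=0.05, min_abs_log2fc=0.0):
    """Label each gene 'up', 'down' or 'ns' (not significant).

    results: dict mapping gene ID -> (log2 fold change, adjusted p-value).
    """
    labels = {}
    for gene_id, (log2fc, adj_p) in results.items():
        passes = adj_p < fdr and abs(log2fc) >= min_abs_log2fc
        if passes and log2fc > 0:
            labels[gene_id] = "up"
        elif passes and log2fc < 0:
            labels[gene_id] = "down"
        else:
            labels[gene_id] = "ns"
    return labels


# Hypothetical results table: (log2FC, adjusted p-value).
toy = {
    "gene_1": (2.3, 0.001),
    "gene_2": (0.4, 0.010),
    "gene_3": (-1.6, 0.020),
    "gene_4": (-0.2, 0.300),
}

p_only = classify(toy)                          # adjusted p-value cut-off alone
p_and_fc = classify(toy, min_abs_log2fc=1.0)    # plus an assumed |log2FC| >= 1 filter

print(p_only)
print(p_and_fc)
```

In this toy table, gene_2 is significant but changes by less than two-fold, so it counts as up-regulated under the first rule and as unchanged under the second.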

  4. Across the twelve samples, 20 to 38 million reads were obtained

    Per sample or all samples combined? If all samples combined, that is very shallow sequencing for dual-species RNA-seq.
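A quick back-of-the-envelope check makes the two readings concrete. It assumes, purely for illustration, that reads are spread evenly across the twelve samples.

```python
N_SAMPLES = 12
TOTAL_LOW, TOTAL_HIGH = 20e6, 38e6  # quoted range of reads


def per_sample_depth(total_reads, n_samples=N_SAMPLES):
    """Average reads per sample if total_reads is the sum over all samples."""
    return total_reads / n_samples


# Reading 1: the quoted range is already per sample.
print(f"per sample: {TOTAL_LOW / 1e6:.0f} to {TOTAL_HIGH / 1e6:.0f} million reads")

# Reading 2: the quoted range is the total across all samples.
combined_low = per_sample_depth(TOTAL_LOW)
combined_high = per_sample_depth(TOTAL_HIGH)
print(f"combined: {combined_low / 1e6:.2f} to {combined_high / 1e6:.2f} million reads per sample")
```

Under the second reading each sample would average roughly 1.7 to 3.2 million reads, shared between host and parasite transcripts.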

  5. Genes with low counts were filtered out according to the rule of 3 count(s) per million (cpm) in at least 4 sample(s). Library sizes were scaled using TMM normalization and log-transformed into counts per million or CPM (edgeR package version 3.42.4) [44]. Statistical quality controls were performed through density gene expression distribution, clustering, and sample PCA (SI Fig. S2). Differential expression was computed with the limma-trend approach [46] by fitting all samples into one linear model. Next, the following groups of comparisons were performed: ▪ Infected with unselected reference vs. uninfected (1 comparison). ▪ Infected with selected parasite vs. infected with unselected reference (10 comparisons). A moderated t-test was used for each contrast. The Benjamini-Hochberg method computed an adjusted p-value for each comparison, controlling for false discovery rate (FDR, Adj. p). Then, a host gene ontology (GO, biological processes) enrichment analysis was carried out on differentially expressed (DE) genes through Panther 19.0 and “vectorbase.org” using the A. gambiae genome as a reference (Fig. 3 and 4).

    Would you be willing to add which functions you used for each of these steps?
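For readers unfamiliar with two of the steps named above, the low-count filter and the Benjamini-Hochberg adjustment can be sketched in plain Python. This is not the authors' code: the function names, the toy counts and the choice of ">=" for the 3 cpm rule are assumptions, and TMM scaling is deliberately left out. In an edgeR/limma workflow these steps are commonly done with calls such as `cpm()`, `calcNormFactors(method = "TMM")`, `lmFit()`, `contrasts.fit()`, `eBayes(trend = TRUE)` and `topTable(adjust.method = "BH")`, but the preprint excerpt does not say which functions were actually called.

```python
def cpm_filter(counts, min_cpm=3.0, min_samples=4):
    """Return IDs of genes with CPM >= min_cpm in at least min_samples samples.

    counts: dict mapping gene ID -> list of raw counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total raw counts per sample (no TMM scaling in this sketch).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = []
    for gene_id, row in counts.items():
        cpms = [1e6 * c / size for c, size in zip(row, lib_sizes)]
        if sum(value >= min_cpm for value in cpms) >= min_samples:
            kept.append(gene_id)
    return kept


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for three genes in five samples (each library sums to 1e6).
toy_counts = {
    "gene_big": [999_990] * 5,
    "gene_mid": [8, 8, 8, 8, 0],
    "gene_low": [2, 2, 2, 2, 10],
}
kept = cpm_filter(toy_counts)
adj = bh_adjust([0.01, 0.04, 0.03, 0.005])
print(kept)
print([round(p, 3) for p in adj])
```

Listing the exact function and arguments for each step in the methods would remove the guesswork that a sketch like this has to make, for example whether the cpm rule is inclusive and whether library sizes were computed before or after filtering.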

  6. “Glimma” [38], “cowplot” [39], “ggrepel” [40], “RColorBrewer” [41], “maptools” [42], “sp” [43], “edgeR” [44], “MASS” [45], “limma” [46], “knitr” [47], “ggplot2” [48] and “dplyr” [49].

    Would you be willing to add versions for these tools?

  7. The poly-A tails and adapters were removed from single-end reads, and quality was trimmed with Cutadapt version 4.8 [34]. Reads matching ribosomal RNA sequences were removed with fastq_screen version 0.11.1 [35]. Reads longer than 40nt were aligned against the concatenated genomes of Anopheles gambiae (NCBI: GCA_943734735.2) and Vavraia culicis floridensis (NCBI: GCA_000192795.1) using STAR version 2.7.10a [36]. The number of read counts per gene locus was summarized with htseq-count version 0.11.2 [37] using gene annotation.

    I think these sentences might be out of order. I expected to read them after reading about the type of sequencing.

    Additionally, I'm wondering whether you considered looking for differential isoform splicing in your samples. Genome reduction may mean there aren't many introns to detect, but I'm curious whether this strategy is used in these different biological conditions. Splicing detection may be hard with single-end reads, but since you have a reference genome it may be possible.