Multiple paths toward repeated phenotypic evolution in the spiny‐leg adaptive radiation (Tetragnatha; Hawai'i)

Abstract

The repeated evolution of phenotypes provides clear evidence for the role of natural selection in driving evolutionary change. However, the evolutionary origin of repeated phenotypes can be difficult to disentangle as it can arise from a combination of factors such as gene flow, shared ancestral polymorphisms or mutation. Here, we investigate the presence of these evolutionary processes in the Hawaiian spiny‐leg Tetragnatha adaptive radiation, which includes four microhabitat‐specialists or ecomorphs, with different body pigmentation and size (Green, Large Brown, Maroon, and Small Brown). We investigated the evolutionary history of this radiation using 76 newly generated low‐coverage, whole‐genome resequenced samples, along with phylogenetic and population genomic tools. Considering the Green ecomorph as the ancestral state, our results suggest that the Green ecomorph likely re‐evolved once, the Large Brown and Maroon ecomorphs evolved twice and the Small Brown evolved three times. We found that the evolution of the Maroon and Small Brown ecomorphs likely involved ancestral hybridization events, while the Green and Large Brown ecomorphs likely evolved through novel mutations, despite a high rate of incomplete lineage sorting in the dataset. Our findings demonstrate that the repeated evolution of ecomorphs in the Hawaiian spiny‐leg Tetragnatha is influenced by multiple evolutionary processes.

Article activity feed

  1. This dataset was used as input in NgsDist (Vieira et al. 2015), where we specified a block size of 20 SNPs, and 100 bootstrap replicates. We then ran the NJ software fastme v2 (Lefort et al. 2015), merged all the 100 replicates into a final tree with bootstrap support using RAXML while specifying an optimization of branch-length, and specifying GTRCAT (GTR + Optimization of substitution rates + Optimization of site-specific) as the substitution model.

    I have a couple thoughts about this - is there a reason why a combination of fastme and RAXML was used as opposed to full likelihood inference using RAXML or IQ-TREE?

    I ask because both RAxML and IQ-TREE include methods for acquisition bias correction when using SNPs (which I would suggest using here), and these could improve inference of both topology and branch lengths.

    An entirely different strategy for inferring a species tree from SNP data under the multispecies coalescent, in a computationally efficient way, would be SVDquartets (Chifman and Kubatko 2014). It is potentially well suited to your dataset, as it can leverage a large number of SNPs effectively and even handles linked sites robustly. The caveat is that it cannot infer branch lengths, but this might not be a concern for your use case.
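To make the acquisition-bias point concrete: ascertainment-corrected likelihood models (for example, the `+ASC` model suffix in IQ-TREE) generally expect an alignment that contains only variable sites. Below is a minimal Python sketch of that pre-filtering step. The function name `drop_invariant_sites` and the toy alignment are hypothetical and not taken from the manuscript, and treating every non-ACGT character as missing data is a simplifying assumption.

```python
def drop_invariant_sites(alignment):
    """Keep only alignment columns that show at least two distinct bases.

    `alignment` maps sample name -> aligned sequence (all the same length).
    Characters other than A/C/G/T (gaps, N, ambiguity codes) are ignored
    when deciding whether a column is variable (simplifying assumption).
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all sequences must have the same length")

    n_sites = len(seqs[0]) if seqs else 0
    keep = []
    for i in range(n_sites):
        observed = {s[i] for s in seqs} & set("ACGT")
        if len(observed) >= 2:
            keep.append(i)

    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment (hypothetical): only columns 1 and 3 are variable.
toy = {
    "sample_1": "ACGTNA",
    "sample_2": "ACGAAA",
    "sample_3": "ATGANA",
}
filtered = drop_invariant_sites(toy)
print(filtered)
```

Columns that are variable only because of missing data (column 4 in the toy example) are dropped as well, which is usually what an ascertainment-corrected model needs.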

  2. Phyluce allows extraction of UCEs, but because of the missing data due to low coverage and fragmented genomes, we were only able to retrieve 29 UCEs that were present in half of the dataset (16 of the UCEs were present in all individuals; and 23 UCEs present in ≥70 individuals).

    I may have missed it, but I don't believe the methods mention which software was used for phylogenetic inference from the recovered UCEs. My guess would be IQ-TREE, the same as for the mitochondrial gene? It might be worth clarifying here.

    Relatedly, was a single concatenated phylogeny used, or did you infer gene trees as well? IQ-TREE can easily infer both (gene trees and a concatenated species tree; see http://www.iqtree.org/doc/Concordance-Factor). This would also enable you to infer multiple concordance factors, which could help to summarize genealogical concordance and potentially even quantify per-site discordance using your concatenated SNP dataset.
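As a rough illustration of what a gene concordance factor summarizes, the sketch below computes the fraction of gene trees that contain a given branch of a reference tree. This is a deliberately simplified version: it assumes every gene tree has the full taxon set and is supplied as a list of clades, whereas IQ-TREE's own definition is more involved (for example, it accounts for gene trees with missing taxa). The function names and the five-taxon toy data are hypothetical.

```python
def normalize_split(clade, taxa):
    """Return a canonical key for the bipartition clade | rest.

    The side that does NOT contain the alphabetically first taxon is used,
    so a clade and its complement map to the same key.
    """
    clade = frozenset(clade)
    taxa = frozenset(taxa)
    anchor = min(taxa)
    return taxa - clade if anchor in clade else clade


def gene_concordance(reference_clade, gene_tree_clades, taxa):
    """Fraction of gene trees whose bipartitions include the reference branch."""
    if not gene_tree_clades:
        raise ValueError("need at least one gene tree")
    target = normalize_split(reference_clade, taxa)
    hits = 0
    for clades in gene_tree_clades:
        splits = {normalize_split(c, taxa) for c in clades}
        if target in splits:
            hits += 1
    return hits / len(gene_tree_clades)


# Toy example (hypothetical): four gene trees over taxa A-E,
# each described by its non-trivial clades.
taxa = {"A", "B", "C", "D", "E"}
gene_trees = [
    [{"A", "B"}, {"D", "E"}],       # contains the branch A,B | C,D,E
    [{"C", "D", "E"}, {"D", "E"}],  # same branch, written from the other side
    [{"A", "C"}, {"D", "E"}],       # conflicts
    [{"B", "C"}, {"A", "D"}],       # conflicts
]
gcf = gene_concordance({"A", "B"}, gene_trees, taxa)
print(gcf)
```

With only 29 UCEs, a per-branch summary like this would at least show whether the concatenated topology is backed by a majority of loci or by a handful of them.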

  3. Recent genomic work has shown that co-occurring closely related species belonging to the Green ecomorph do not hybridize, and it has been argued that there may be some overlap in their ecological niches in the early stages of diversification, suggesting a possible avenue for the divergence of ecomorphs through character displacement upon secondary contact (Schluter 2000; Cotoras et al. 2018).

    I'm confused, though, about how coexistence without hybridization among closely related species of the same ecomorph suggests character displacement.

    Maybe this is clarified later (e.g. some other feature of morphology/ecology other than coloration differs within ecomorph), but if so, it should be explained/mentioned earlier in the paragraph.

    For instance, is morphological variation within ecomorphs equal to, greater than, or smaller than variation among ecomorphs?

    Given that coloration is the only phenotype explored here, it might be worth providing some additional context on this point. As written, it suggests that the paper might later investigate character displacement, though the analyses are primarily genetic.

  4. Both these analyses benefited from a tree-backbone and we specified the tree obtained in ngsDist.

    Could you elaborate on why you chose to use this tree versus the UCE likelihood tree? Are results from Dsuite robust to the choice of species tree used?

  5. These species can be grouped into four ecomorphs (Figure 1 A-D), which are linked to the substrate they inhabit (Gillespie 2004): the Large Brown ecomorph is found on tree bark (Figure 1A), the Green ecomorph on leaves (Figure 1B), the Maroon ecomorph on mosses (Figure 1C), and the Small Brown ecomorph on twigs

    What is the extent of within- versus among-ecomorph morphological variation? Are ecology and coloration the only characteristics that reliably distinguish the ecomorphs?

  6. measured as R2, was above 0.11

    This is a very specific number to use as a filter. Was it determined from the patterns of LD decay inferred in the steps above? I would suggest providing some additional explanation of why this value was used as the LD filter.
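For readers less familiar with this kind of filter, here is a minimal sketch of what thresholding on R2 amounts to. It computes r² as the squared Pearson correlation between genotype dosage vectors (0/1/2) and greedily drops any SNP whose r² with an already retained SNP exceeds the threshold. Using called genotypes is an assumption made for simplicity; with low-coverage data the estimate would more plausibly come from genotype likelihoods. The function names and toy genotypes are hypothetical; only the 0.11 threshold comes from the manuscript.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # r^2 is undefined for a monomorphic site; treated as 0 here by choice.
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def prune_by_ld(genotypes, threshold):
    """Greedy pruning: keep a SNP only if its r^2 with every SNP kept so far
    is at or below `threshold`. Returns the indices of retained SNPs."""
    kept = []
    for idx, geno in enumerate(genotypes):
        if all(r_squared(geno, genotypes[k]) <= threshold for k in kept):
            kept.append(idx)
    return kept


# Toy data (hypothetical): rows are SNPs, columns are six individuals.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to SNP 0, so r^2 = 1
    [0, 0, 0, 2, 2, 2],  # uncorrelated with SNP 0, so r^2 = 0
]
kept = prune_by_ld(snps, threshold=0.11)
print(kept)
```

Showing how the number of retained SNPs changes across a few thresholds around 0.11 would make it easier to judge how sensitive the downstream analyses are to this choice.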

  7. There are mitochondrial-nuclear tree discordances based on topology and bootstrap support

    If I'm understanding the description of the UCE tree correctly, would it be correct to say that there is one additional mitonuclear discordance when considering the UCE tree (the placement of T. anuenue)?

  8. We also display K = 15 because that is the number of species in the dataset (Figure 3).

    Oh, I understand now! It might be worth adding this explanation to the methods. I would also suggest including a plot of the log-likelihood for each value of K, as it is difficult to evaluate how strongly model support differs without it.

    However, it is somewhat surprising that these data would suggest only 2-3 clusters when there are 15 described species. This seems like quite a discrepancy.
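The kind of summary I have in mind for the log-likelihood plot can be sketched as follows: take the best log-likelihood across replicate runs for each K, then look at the gain from each K to the next. The numbers below are invented placeholders purely to show the calculation, not values from the manuscript, and the function names are hypothetical.

```python
def best_loglik_per_k(runs):
    """Map each K to the highest log-likelihood among its replicate runs.

    `runs` is an iterable of (K, log_likelihood) pairs.
    """
    best = {}
    for k, loglik in runs:
        if k not in best or loglik > best[k]:
            best[k] = loglik
    return dict(sorted(best.items()))


def successive_gains(best):
    """Improvement in best log-likelihood when moving from one K to the next."""
    ks = list(best)
    return {ks[i]: best[ks[i]] - best[ks[i - 1]] for i in range(1, len(ks))}


# Placeholder values (NOT real results): (K, log-likelihood) per replicate run.
runs = [
    (1, -1000.0), (1, -1005.0),
    (2, -900.0), (2, -910.0),
    (3, -880.0), (3, -895.0),
    (4, -878.0),
]
best = best_loglik_per_k(runs)
gains = successive_gains(best)
for k, loglik in best.items():
    print(k, loglik, gains.get(k, "-"))
```

Because the raw log-likelihood typically keeps rising as K grows, the size of the gain at each step is often more informative than the maximum itself.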

  9. In the nuclear tree,

    Which nuclear tree? I see below that there is some discordance in topologies inferred using the two nuclear datasets (SNPs with fastme, UCEs with IQ-Tree), so it would be worth specifying here.

  10. minimum map quality of 30, minimum base quality of 20

    Just a suggestion, but it might simplify things for you if you had a section prior to any use of ANGSD describing the commonly used parameter settings across these analyses?

  11. Based on Patterson’s D statistics (Supplementary Figure 02) and F4-branch statistics, we found excess allele sharing between ecomorphs and within ecomorphs (Figure 4).

    It might be really interesting to see the distribution of D-statistics pooled into both within vs among species and ecomorph comparisons.

    I say this because it could be really informative to compare/rank the evidence for hybridization across levels - within/among species, as compared to within/among ecomorphs.
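A sketch of the pooling I am suggesting: compute Patterson's D for each trio from its ABBA and BABA site-pattern counts, D = (ABBA - BABA) / (ABBA + BABA), and then group the values by comparison type so that the distributions can be plotted side by side. The category labels, counts, and function names below are hypothetical placeholders, not values from the manuscript.

```python
def patterson_d(abba, baba):
    """Patterson's D from ABBA and BABA site-pattern counts."""
    total = abba + baba
    if total == 0:
        raise ValueError("ABBA + BABA must be positive")
    return (abba - baba) / total


def pool_d_by_category(records):
    """Group D values by comparison category.

    `records` is an iterable of (category, abba_count, baba_count) tuples.
    Returns a dict mapping category -> list of D values, in input order.
    """
    pooled = {}
    for category, abba, baba in records:
        pooled.setdefault(category, []).append(patterson_d(abba, baba))
    return pooled


# Placeholder counts (NOT real results), one tuple per trio.
records = [
    ("within_ecomorph", 120, 80),
    ("within_ecomorph", 100, 100),
    ("between_ecomorph", 150, 50),
]
pooled = pool_d_by_category(records)
print(pooled)
```

The same grouping could be repeated with species-level labels (within versus among species) to rank the evidence for hybridization across levels.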

  12. The maximum-likelihood UCE tree is topologically concordant with the ngsDist tree at the ecomorph level (Supplementary Figure 01).

    Could you clarify what you mean by this? It's difficult to intuit what this discordance looks like without the supplement being available. Regardless, it's nice to see that the discordance between the two topologies seems modest.
