Migration without interbreeding: Evolutionary history of a highly selfing Mediterranean grass inferred from whole genomes

Curation statements for this article:
  • Curated by eLife

    Summary: This paper has several strengths. It addresses Brachypodium distachyon population genetics and demography to help understand phenomena that have been investigated in less data-rich papers before. The authors do so with whole-genome sequencing of both a pre-existing global collection and additional "gap-filling" sampling. Analyses have been conducted using best practices, and most of the conclusions reflect the data and analyses presented.

    Major findings include the existence of large-scale population structure with three distinct lineages, discordance between geographical occurrence and genetic relatedness (clades within the lineages), and at shorter geographic scales, signs of dispersal without interbreeding. These patterns are explained by a combination of near-complete selfing and seed dispersal.

    The work attempts to cover a lot of ground, including selfing, seed dispersal, coalescence theory, microevolution, plasticity and frequency dependent selection, all mentioned in the abstract. The presentation would probably benefit from focusing on one or two aspects and making a stronger case for them.

    The reviewers noted that studies of this kind will often be descriptive due to the largely untestable nature of complex hypotheses of historical dispersal and evolution. Direct empirical testing of some of the hypotheses put forward here would require substantial experimental work (e.g. measuring the fitness of artificial hybrids to demonstrate post-zygotic reproductive isolation). As a first pass, simulations would likely suffice to test whether processes such as drift, selfing, and founder effects are sufficient to explain the population structure, or whether more complex processes such as frequency-dependent selection or reproductive isolation need to be invoked.


Abstract

Wild plant populations show extensive genetic subdivision and are far from the ideal of panmixia which permeates population genetic theory. Understanding the spatial and temporal scale of population structure is therefore fundamental for empirical population genetics – and of interest in itself, as it yields insights into the history and biology of a species. In this study we extend the genomic resources for the wild Mediterranean grass Brachypodium distachyon to investigate the scale of population structure and its underlying history at whole‐genome resolution. A total of 86 accessions were sampled at local and regional scales in Italy and France, which closes a conspicuous gap in the collection for this model organism. The analysis of 196 accessions, spanning the Mediterranean from Spain to Iraq, suggests that the interplay of high selfing and seed dispersal rates has shaped genetic structure in B. distachyon. At the continental scale, the evolution in B. distachyon is characterized by the independent expansion of three lineages during the Upper Pleistocene. Today, these lineages may occur on the same meadow yet do not interbreed. At the regional scale, dispersal and selfing interact and maintain high genotypic diversity, thus challenging the textbook notion that selfing in finite populations implies reduced diversity. Our study extends the population genomic resources for B. distachyon and suggests that an important use of this wild plant model is to investigate how selfing and dispersal, two processes typically studied separately, interact in colonizing plant species.

Article activity feed

  1. Reviewer #3:

    Whole-genome sequence data from a geographically broad set of 86 Brachypodium distachyon samples are presented and combined with previous data. In addition, flowering time data collected under both field and controlled conditions are presented. The manuscript has many interesting aspects and ideas, but its main agenda is not clear. The authors mention selfing, seed dispersal, coalescence theory, microevolution, plasticity and frequency-dependent selection in the abstract, but none of those topics is explored in depth in the manuscript. Multiple points, e.g. in the methods, need clarification. The manuscript would benefit from focusing on one or two aspects and making strong cases for them.

    Main comments:

    1. It is an overstatement to claim that this dataset covers the region from Iberia to Iraq when previous datasets already covered Iberia and Iraq. Here, French and Italian samples are added to previous data.

    2. The connection between heterozygosity, structural variation and assembly issues due to paralogy should be presented more clearly. For example, in r. 130-134, it is not obvious what mapping BdTR7a against itself and identifying fewer heterozygous sites proves. In addition, the procedure for masking the false heterozygosity should be described more explicitly. Inspection in IGB and thresholds defined by "trial and error" are not reproducible methods. Also, wouldn't one want to take into account the overall level of diversity in a given region instead of applying a fixed threshold such as "ten or more SNPs along a distance of at least 300 bp"?

    3. Sympatry issue: The different lineages are described as sympatric, so it is important to be specific about the sampling locations. How close are the closest sympatric samples representing different lineages? Is that truly a sympatric setting? Further, in r. 176-181, how does plotting ancestry components on the map prove that there has been no gene flow between sympatric lineages? There seems to be shared ancestry, but it is a known issue that shared ancestry and admixture are not easy to separate. This aspect is central to the paper and needs more rigorous analysis, e.g. with forward or coalescent simulations. The reasoning continues in rows 344-352 but is not backed up by any analysis other than plotting ancestry components on the map. If it is, this should be stated more precisely.

    4. R. 301-303: this statement sounds as if the authors are suggesting that selfing and dispersal are actively (or as a result of selection) interacting and maintaining the diversity. I did not see convincing evidence that the distribution of lineages is anything more than a combination of drift, selfing and random dispersal events. Maybe this is what the authors mean, but it should be stated more clearly.
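
Main comment 2 contrasts a fixed cutoff ("ten or more SNPs along a distance of at least 300 bp") with a cutoff that accounts for the diversity of the region. The sketch below shows one hypothetical way to make both variants explicit and reproducible. It is not the authors' masking procedure: the function names, the `max_gap` parameter and the scaling rule are assumptions introduced purely for illustration.

```python
import math
from typing import Iterable, List, Tuple


def flag_het_clusters(
    positions: Iterable[int],
    min_sites: int = 10,   # "ten or more SNPs" (quoted criterion)
    min_span: int = 300,   # "at least 300 bp" (quoted criterion)
    max_gap: int = 100,    # assumption: largest gap tolerated inside one cluster
) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_sites) for dense runs of heterozygous sites
    on a single chromosome."""
    clusters: List[Tuple[int, int, int]] = []
    run: List[int] = []

    def close_run() -> None:
        # Keep the current run only if it meets both the count and span cutoffs.
        if len(run) >= min_sites and run[-1] - run[0] >= min_span:
            clusters.append((run[0], run[-1], len(run)))

    for pos in sorted(positions):
        if run and pos - run[-1] > max_gap:
            close_run()
            run = []
        run.append(pos)
    close_run()
    return clusters


def scaled_min_sites(
    base_min_sites: int,
    local_snps_per_kb: float,
    genome_snps_per_kb: float,
) -> int:
    """Hypothetical diversity-aware cutoff: raise the site count required in
    regions that are more diverse than the genome-wide average."""
    if genome_snps_per_kb <= 0:
        raise ValueError("genome-wide SNP density must be positive")
    ratio = local_snps_per_kb / genome_snps_per_kb
    return max(base_min_sites, math.ceil(base_min_sites * ratio))


# Ten sites spaced 40 bp apart (span 360 bp) plus two isolated sites.
het_sites = list(range(1000, 1400, 40)) + [5000, 9000]
print(flag_het_clusters(het_sites))        # [(1000, 1360, 10)]
print(scaled_min_sites(10, 8.0, 2.0))      # 40
```

Writing the cutoffs as named parameters, rather than settling them by visual inspection, is what makes the masking step repeatable; the specific default values would still need to be justified against the data.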

  2. Reviewer #2:

    Generally, this paper is excellent. It explores many characteristics of Brachypodium distachyon population genetics and demography, many of which have been assumed or hypothesised by less data-rich papers over the last two decades. The authors do so with whole-genome sequencing of both a pre-existing global collection and some novel "gap-filling" sampling. The authors appear to have conducted all analyses using best practices, and the conclusions are largely not over-interpreted. I have only a few minor comments.

    L68: Ideally a more detailed summary of the work summarised in Supp File 1 would be brought into the main manuscript. The introduction in and of itself largely skims over the quite large amount that is already known or assumed about the population genomics and dynamics of B. distachyon, especially the ~4 other recent WGS popgen papers which cover adjacent/overlapping collections and topics to this manuscript.

    L165: with regard to the random sequence subsets for BPP: do these include sequence only from genes, or also from intergenic space? What about TE or other repeat loci? How do you ensure that subset regions are single-copy orthologs in all accessions? I'm no expert on BPP, but I'm mostly aware of BPP being used on exon capture data (i.e. genic sequence and flanking introns), admittedly at different evolutionary scales with a greater expectation that assumptions of orthology are not met.

    L338: the speculation about heterozygosity being induced "in the lab" is very interesting. If you have the data which allows investigating this, could you test if the maternal/paternal haplotypes in heterozygous regions match implausibly distant accessions, suggesting in-lab outcrossing?

    L364-365: wouldn't a decrease in diversity as one moves east imply an eastwards migration? I'm not sure if I'm misreading this sentence or there is a typo which switches the direction of the decrease. In any case perhaps reword this sentence for clarity.

    L403: typo: distance is week -> distance is weak

    L405: typo: descent -> descend. Also, a suggestion: did not descend from a single recent colonization (add "single")

    L410: Seed dispersal then ensures OR "would then ensure" (delete would, or ensures -> ensure).

    L421: While human-commensal seed dispersal likely explains most recent migration, surely the estimated branch times (fig 5) predate significant human movement? Or, phrased alternatively, were there other/additional historical agents of migration?

    L433: are pathogens not a potentially strong selective pressure on (nearly) all plants? How then do pathogens relate uniquely to the reproductive strategy/population structure and dynamics of B. distachyon?

    L435: Is a concluding paragraph required? I feel the discussion ends somewhat abruptly.

    L539: (optional suggestion): given the non-linearity in the IBD plots you present, it would be interesting to apply Generalised Dissimilarity Modelling to test for/examine IBD.

    L567: Please give light measurements in µE PAR (µmol photons/m²/s; 400-700 nm) in addition to/instead of klux.

  3. Reviewer #1:

    The manuscript describes analyses of genomic data to study the population structure and demographic history of Brachypodium distachyon - a selfing Mediterranean grass species. Major findings include the existence of large-scale population structure (3 lineages), discordance between geographical occurrence and genetic relatedness (clades within the lineages), and at shorter scales, signs of dispersal without interbreeding. These patterns are explained by a combination of near-complete selfing and seed dispersal. The methods are appropriate, the results are well reported, and the writing is good. As such, the paper provides interesting insights into the evolutionary history of B. distachyon, but due to its descriptive nature, I somewhat question the paper's value for a wider audience (i.e. people not directly working with B. distachyon). At points, the authors also engage in speculation (not supported by data) where I feel that simpler population genetic processes are ignored.

    In my opinion, the biggest weakness is the descriptive nature of the paper: it describes the genetic structure and demographic history of B. distachyon, but the potential processes giving rise to that structure are only speculated upon. In particular, the authors invoke pre- and post-zygotic reproductive isolation (lines 384 - 387) and pathogen-driven frequency-dependent selection (lines 431 - 435) as potential causes of the observed structure. However, as the paper provides no evidence for such processes, it is not clear to me why they need to be invoked in the first place. Evidence for seed dispersal over relatively short spatial scales is shown (within populations in Italy, Fig 4), but to my reading the results suggest little dispersal/gene flow over long distances (only a few individuals with increased heterozygosity or signs of admixture). Therefore, I believe that the simplest explanation for the genetic structure is founder effects (perhaps human-induced, given the peculiar differences within the A and B lineages) combined with near-complete selfing. This would explain the emergence of the genetic lineages and the lack of interbreeding. Furthermore, I would imagine that the genetic groups are locally adapted (e.g. there is extensive local adaptation among the selfing populations of A. thaliana), which would ensure that one lineage/accession doesn't take over when otherwise feasible (e.g. within the B lineage). If the authors argue otherwise, I would like to see more convincing evidence and/or discussion supporting the invoked processes.

    Below I list a few more specific comments:

    Lines 26 - 27: "[our study] identifies adaptive phenotypic plasticity and frequency-dependent selection as key themes to be addressed with this model system". While reading the abstract, this sentence got me interested and I expected at least some analyses addressing these topics. However, the only place where they are mentioned again is in two highly speculative sentences at the end of the discussion (lines 427 - 435). Although the authors write "themes to be addressed", I think that the complete lack of evidence for adaptive plasticity or pathogen-driven frequency-dependent selection in the current study makes this sentence too misleading to be left in the abstract.

    Lines 51 - 53: "For plants, genome-wide coalescence approaches have therefore been largely restricted to domesticated species and Arabidopsis thaliana". This might have been true some years ago, but not anymore. Just to highlight a few wild plant species (and studies) where demographic history has been studied using whole-genome data: A. lyrata (Mattila et al. 2017 MBE), A. arenosa (Monnahan et al. 2019 Nat Ecol Evol), Capsella genus (Douglas et al. 2015 PNAS, Koenig et al. 2019 eLife), Boechera stricta (Wang et al. 2019 Genome Biol), Populus genus (Wang et al. 2016 MBE, Hou & Li 2020 Front Plant Sci), Cochlearia genus (Bray et al. 2020 bioRxiv), and many more.

    Lines 383 - 387: "Flowering time differences are at best part of an explanation for genetic structure. In the scenario of subsequent lineage expansions we propose here, reproductive isolation might have evolved when the lineages were geographically isolated; and it might include other pre- and post-zygotic barriers in addition to flowering time, namely niche differentiation or genomic incompatibilities". These sentences seem to come out of nowhere. First, I don't fully understand the distinction between genetic structure and lineage expansions. If the latter is a process beyond population structure (i.e. incipient speciation), the paper shows no evidence of that. In fact, as I outlined above, I would imagine that founder effects and near-complete selfing are enough to cause and maintain population differentiation without reproductive isolation.

    Lines 389 - 390: "Furthermore, differences observed in the greenhouse are most likely exaggerated through artificially short vernalization times. As our outdoors experiment shows, all accessions produced flowers within two weeks when they went through prolonged vernalization during winter". How representative are these vernalization times of the natural growing conditions? Large differences were observed in the greenhouse experiment, but the authors argue that these are not meaningful because the outdoor experiment showed little difference. However, a single experiment conducted in Zurich certainly does not capture the environmental variation that exists across the Mediterranean, so I'm not convinced that the role of flowering time can be ruled out so strongly on the basis of these results. That said, the near-complete selfing suggests to me that flowering time is likely not a major factor underlying the genetic structure, and founder effects are a better explanation for it.

    Line 548: Only one species (B. stacei) was used to define ancestral alleles in the fastsimcoal2 analysis. Multiple studies show that the use of a single outgroup, especially based on parsimony, leads to unreliable inferences of ancestral and derived alleles (e.g. Keightley et al. 2016 Genetics, Keightley & Jackson 2018 Genetics). In particular, this leads to overestimation of high-frequency derived variants, distorting the shape of the unfolded SFS. As the observed SFS has more shared high-frequency variants than predicted by the demography model (Fig S5), I imagine that this is an issue. FSC2 also works with the folded SFS, so I wonder why the authors chose to use the unfolded SFS. Unless there is a compelling reason, I suggest either adding more outgroups or simply folding the SFS.
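
To illustrate why folding sidesteps the polarization problem raised for line 548, the following minimal sketch folds a one-population unfolded SFS by summing the entries for derived-allele counts i and n - i. It is an illustration under simple assumptions (a single population, counts stored in a plain list indexed 0..n), not fastsimcoal2's input format or the authors' pipeline; `fold_sfs` is a hypothetical helper name.

```python
from typing import List, Sequence


def fold_sfs(unfolded: Sequence[float]) -> List[float]:
    """Fold a 1D unfolded SFS.

    unfolded[i] is the number of sites at which the derived allele is seen
    i times in a sample of n chromosomes, so len(unfolded) == n + 1.
    The result is indexed by minor-allele count, 0..n // 2.
    """
    n = len(unfolded) - 1
    if n < 1:
        raise ValueError("an SFS needs at least two chromosomes")
    folded: List[float] = []
    for i in range(n // 2 + 1):
        j = n - i
        # The middle class (i == n - i, only when n is even) is counted once.
        folded.append(unfolded[i] if i == j else unfolded[i] + unfolded[j])
    return folded


# Toy unfolded SFS for n = 5 chromosomes (made-up counts).
unfolded = [0, 10, 5, 3, 2, 0]
print(fold_sfs(unfolded))  # [0, 12, 8]

# Swapping ancestral and derived states at every site reverses the unfolded
# SFS but leaves the folded SFS unchanged.
mispolarized = unfolded[::-1]
print(fold_sfs(mispolarized) == fold_sfs(unfolded))  # True
```

Because the folded spectrum depends only on minor-allele counts, errors in assigning the ancestral state from a single outgroup cannot inflate its high-frequency classes; the cost is that information distinguishing low- from high-frequency derived variants is lost.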
