Faroese Whole Genomes Provide Insight into Ancestry and Recent Selection
Curated by eLife
eLife Assessment
This paper presents an analysis of demography and selection from whole-genome sequencing of 40 Faroese, with data that are useful beyond the study region. Much of the analysis is solid, but a more fine-scale analysis of demographic history could have led to more interesting findings. In addition, there are concerns about the selection analyses, given the special nature of the studied population and sampling scheme. Finally, lack of data availability limits the broader value of the paper.
Abstract
The Faroe Islands are home to descendants of a North Atlantic founder population with a unique history shaped by both migration and periods of relative isolation. Here, we investigate the genetic diversity, population structure, and demographic history of the islands by analyzing whole genome sequencing data from 40 participants in the Faroe Genome Project. This represents the first whole genome sequencing panel of this size from the Faroe Islands. We observed numerous putatively functional private alleles, including stop gain variants and high impact missense variants in the cohort. Faroese individuals had a higher proportion of their genomes contained in long runs of homozygosity than other European groups, including Finnish, suggesting a more recent or stronger bottleneck in the Faroese population. Signals of positive selection were identified at loci containing genes that play roles in vitamin D and dietary fat absorption and DNA repair, while increased diversity on lactase persistence haplotypes was observed. Fine-scale analysis of haplotype structure in modern and ancient European genomes revealed genetic affinities with ancient Iron Age individuals from the North and West of Europe, providing evidence for potential contributions to the Faroese gene pool from Celtic and Viking populations as well as information about the temporal order in which these events happened. This study highlights the impact of evolutionary processes, such as ancient admixture, founder events, and positive selection, on the present-day genetic architecture of North Atlantic founder populations like the Faroe Islands.
Reviewer #1 (Public review):
Summary:
The paper reports an analysis of whole-genome sequence data from 40 Faroese. The authors investigate aspects of demographic history and natural selection in this population. The key findings are that the Faroese (as expected) have a small population size and are broadly of Northwest European ancestry. Accordingly, selection signatures are largely shared with other Northwest European populations, although the authors identify signals that may be specific to the Faroes. Finally, they identify a few predicted deleterious coding variants that may be enriched in the Faroes.
Strengths:
The data are appropriately quality-controlled and appear to be of high quality. Some aspects of the Faroese population history are characterized, in particular, by the relatively (compared to other European populations) high proportion of long runs of homozygosity, which may be relevant for disease mapping of recessive variants. The selection analysis is presented reasonably, although as the authors point out, many aspects, for example differences in iHS, can reflect differences in demographic history or population-specific drift and thus can't reliably be interpreted in terms of differences in the strength of selection.
Weaknesses:
The main limitations of the paper are as follows:
(1) The data are not available. I appreciate that (even de-identified) genotype data cannot be shared; however, that does substantially reduce the value of the paper. Minimally, I think the authors should share summary statistics for the selection scans, in line with the standard of the field.
(2) The insight into the population history of the Faroes is limited, relative to what is already known (i.e., they were settled around 1200 years ago, by people with a mixture of Scandinavian and British ancestry, have a small effective population size, and any admixture since then comes from substantially similar populations). It's obvious, for example, that the Faroese population went through a tighter bottleneck than, say, GBR.
More sophisticated analyses (for example, ARG-based methods, or IBD or rare variant sharing) would be able to reveal more detailed and fine-scale information about the history of the populations that is not already known. PCA, ADMIXTURE, and HaploNet analyses are broad summaries, but the interesting questions here would be more specific to the Faroes. For example, what are the proportions of Scandinavian vs Celtic ancestry? What is the date and extent of sex bias (as suggested by the uniparental data) in this admixture? I think that it is a bit of a missed opportunity not to address these questions.
(3) I don't really understand the rationale for looking at HLA-B allele frequencies. The authors write that "ankylosing spondylitis (AS) may be at a higher prevalence in the Faroe Islands (unpublished data), however, this has not been confirmed by follow-up epidemiological studies". So there's no evidence (certainly no published evidence) that AS is more prevalent, and hence nothing to explain with the HLA allele frequencies?
Reviewer #2 (Public review):
In this paper, Hamid et al. present 40 genomes from the Faroe Islands. They use these data (a pilot study for an anticipated larger-scale sequencing effort) to discuss the population genetic diversity and history of the sample, and of the Faroese population. I think this is an overall solid paper; it is well-polished and well-written. It is somewhat descriptive (as might be expected for an exploratory pilot study), but does make good use of the data.
The data processing and annotation follow a state-of-the-art protocol, and I could not find any evidence in the results that would point towards bioinformatic issues having substantially biased them. The preliminary results also lead to the identification of some candidate disease alleles, showing that small, isolated cohorts can be an efficient way to find populations with locally common, but globally rare disease alleles.
I also enjoyed the population structure analysis in the context of ancient samples, which gives some context to the genetic ancestry of Faroese, although it would have been nice if that could have been quantified, and it is unfortunate that the sampling scheme effectively precludes within-Faroes analyses.
I am unfortunately quite critical of the selection analysis, both on a statistical level and, more importantly, I do not believe it measures what the authors think it does.
Major comments:
(1) Admixture timing/genomic scaling/localization:
As the authors lay out, the Faroes were likely colonized in the last 1,000-1,500 years, i.e., 40-60 generations ago. That means most genomic processes that have happened on the Faroes should have signatures on the order of ~1-2 cM, whereas more local patterns likely indicate genetic history predating the colonization of the islands. Yet the paper seems to be oblivious to this (to me) fascinating and somewhat unique premise. Maybe this thought is wrong, but I think the authors miss a chance here to explain why the reader should care, beyond the facts that small populations might have high-frequency risk alleles and that the Faroes are intrinsically interesting. More importantly, I think it leads to some misinterpretations in the selection analysis.
(2) ROH:
Would the sampling scheme impact ROH? How would it deal with individuals with known parental coancestry? As an example of what I mean by my previous comment, 1 Mb is short enough that I would expect most/many 1 Mb ROH tracts to come from pedigree loops predating the colonization of the Faroes (i.e., I am actually quite surprised that there isn't much more long ROH, which makes me wonder if that would be impacted by the sampling scheme).
(3) Selection scan:
We are talking about a bottlenecked population that is recently admixed (Faroese), compared to a population (GBR) putatively more closely related to one of its sources. My guess would be that selection in such a scenario would be possibly very hard to detect, and even then, selection signals might not differentiate selection in Faroese vs. GBR, but rather selection/allele frequency differences between different source populations. I think it would be good to spell out why XP-EHH/iHS measures selection at the correct time scale, and how/if these statistics are expected to behave differently in an admixed population.
(4) Similarly, for the discussion of LCT, I am not convinced that the haplotypes depicted here are on the right scale to reflect processes happening on the Faroes. Given the admixture/population history, it at the very least should be discussed in the context of whether the 13910 allele frequency on the Faroes is at odds with what would be expected based on the admixture sources.
(5) I am lacking information to evaluate the procedure for turning the outliers into p-values. Both iHS and XP-EHH are ratio statistics, meaning they might be heavy-tailed if one is not careful, and the central limit theorem may not apply. It would be much easier (and probably sufficient for the points being made here) to reframe this analysis in terms of empirical outliers.
(6) Oldest individual predating gene flow: It seems impossible to make any statements based on a single individual. Why is it implausible that this person (or their parents), e.g., moved to the Faroes within their lifetime and died there?
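The time-scale argument in comments (1) and (2) can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only and is not part of the review or the paper. It assumes a generation time of 25 years (which is what maps 1,000-1,500 years onto 40-60 generations), the common approximation that an ancestry tract introduced g generations ago has a mean length of about 100/g cM, and the analogous approximation that a segment inherited identical by descent from an ancestor g generations back (2g meioses) has a mean length of about 100/(2g) cM. All function names are hypothetical.

```python
def generations_since(years, generation_time=25.0):
    """Convert a time in years to generations (generation time is an assumption)."""
    return years / generation_time


def mean_admixture_tract_cm(generations):
    """Approximate mean ancestry tract length in cM after `generations`
    generations of recombination (simple 100 / g approximation)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations


def mean_ibd_segment_cm(generations):
    """Approximate mean length in cM of a segment shared identical by descent
    through a common ancestor `generations` back (2 * g meioses)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2.0 * generations)


if __name__ == "__main__":
    for years in (1000, 1500):
        g = generations_since(years)
        print(
            f"{years} years ~ {g:.0f} generations: "
            f"admixture tracts ~ {mean_admixture_tract_cm(g):.2f} cM, "
            f"IBD/ROH segments ~ {mean_ibd_segment_cm(g):.2f} cM"
        )
```

Under these assumptions, segments much shorter than about 1 cM would mostly trace to ancestors older than the settlement window, which is the sense in which shorter-scale patterns may predate the colonization of the islands. Converting between cM and Mb requires a recombination map and is not attempted here.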
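The reframing suggested in comment (5), reporting empirical outliers rather than parametric p-values, can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a plain list of per-site scores (for example, standardized iHS or XP-EHH values produced elsewhere), ranks them against the genome-wide distribution, and flags the top fraction. No distributional assumption is made, so heavy tails do not matter. The function names and the 1% default cut-off are hypothetical choices.

```python
def empirical_upper_tail(scores):
    """For each score, return the fraction of all scores that are >= it.
    Tied scores share the same (largest) tail fraction."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    tail = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j < n and scores[order[j]] == scores[order[i]]:
            j += 1
        for k in range(i, j):
            tail[order[k]] = j / n
        i = j
    return tail


def top_fraction(scores, fraction=0.01):
    """Return indices of sites whose empirical tail fraction is <= `fraction`."""
    tail = empirical_upper_tail(scores)
    return [i for i, t in enumerate(tail) if t <= fraction]


if __name__ == "__main__":
    demo_scores = [0.1, 3.0, -1.0, 2.0]  # made-up values for illustration
    print(empirical_upper_tail(demo_scores))
    print(top_fraction(demo_scores, fraction=0.25))
```

A two-sided statistic such as iHS would typically be ranked on its absolute value. The resulting tail fractions describe how extreme a site is relative to the rest of the genome; they are not p-values under a null model.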