Calculating and interpreting F_ST in the genomics era

Menno J. de Jong
Cock van Oosterhout
A. Rus Hoelzel
Axel Janke

Abstract

The relative genetic distance between populations is commonly measured using the fixation index (F_ST). Because F_ST is traditionally inferred from allele frequency differences, the question arises how it can be estimated and interpreted when analysing genomic datasets with low sample sizes. Here, we advocate an elegant solution first put forward by Hudson et al. (1992): F_ST = (D_xy − π_xy)/D_xy, where D_xy and π_xy denote mean sequence dissimilarity between and within populations, respectively. This multi-locus F_ST metric can be derived from allele frequency data, but also from sequence alignment data alone, even when sample sizes are low and/or unequal. As with other F_ST metrics, the numerator denotes net divergence (D_a), which is equivalent to the f_2-statistic and Nei's D (for realistic estimates of D_xy and π_xy). In terms of demographic inference, net divergence measures the difference between the increases in D_xy and π_xy since the population split, which arises because genetic drift reduces coalescence times within populations. Because different combinations of ΔD_xy and Δπ_xy can produce identical F_ST estimates, no universal relationship exists between F_ST and population split time. Still, for recent population splits, when novel mutations are negligible, F_ST estimates can be accurately converted into coalescent units (τ, i.e., split time in multiples of 2N_e). This in turn allows gene tree discordance to be quantified without multispecies-coalescent-based analyses, using the formula P_discordance = ⅔·(1 − F_ST). To facilitate the use of the Hudson F_ST metric, we implemented new utilities in the R package SambaR.
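
As a rough illustration of the formulas in the abstract, the following Python sketch computes the Hudson F_ST from a small sequence alignment and then applies P_discordance = ⅔·(1 − F_ST). It is not the SambaR implementation, and all function names are hypothetical. It assumes haploid, gap-free aligned sequences, and it takes π_xy as the unweighted mean of the two within-population dissimilarities, which is one possible reading of "within populations". The conversion to τ additionally assumes the standard coalescent result P_discordance = ⅔·e^(−τ), which combined with the abstract's formula would give τ = −ln(1 − F_ST).

```python
import math
from itertools import combinations, product


def dissimilarity(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(pop: list[str]) -> float:
    """Mean pairwise dissimilarity among sequences of one population."""
    pairs = list(combinations(pop, 2))
    if not pairs:
        raise ValueError("need at least two sequences per population")
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_x: list[str], pop_y: list[str]) -> float:
    """Mean pairwise dissimilarity between two populations (D_xy)."""
    pairs = list(product(pop_x, pop_y))
    if not pairs:
        raise ValueError("both populations must contain sequences")
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)


def hudson_fst(pop_x: list[str], pop_y: list[str]) -> float:
    """F_ST = (D_xy - pi_xy) / D_xy, as given in the abstract."""
    d_xy = mean_between(pop_x, pop_y)
    if d_xy == 0:
        raise ValueError("D_xy is zero; F_ST is undefined")
    # Assumption: pi_xy is the unweighted mean of the two within-population values.
    pi_xy = (mean_within(pop_x) + mean_within(pop_y)) / 2
    return (d_xy - pi_xy) / d_xy


def p_discordance(fst: float) -> float:
    """P_discordance = 2/3 * (1 - F_ST), as given in the abstract."""
    return (2 / 3) * (1 - fst)


def split_time_tau(fst: float) -> float:
    """Assumed conversion tau = -ln(1 - F_ST); only meaningful for recent splits."""
    if not 0 <= fst < 1:
        raise ValueError("F_ST must lie in [0, 1) for this conversion")
    return -math.log(1 - fst)


# Toy alignment: two sequences per population, eight sites each.
pop_x = ["AAAAAAAA", "AAAAAAAT"]
pop_y = ["TTAAAAAA", "TTAAAAAT"]

fst = hudson_fst(pop_x, pop_y)
print(round(fst, 3))  # 0.6 (D_xy = 0.3125, pi_xy = 0.125)
print(p_discordance(fst))
print(split_time_tau(fst))
```

In this toy example the sample sizes are deliberately small (two sequences per population), reflecting the abstract's point that the metric can be computed from sequence alignment data alone. Small samples can also yield negative F_ST estimates, which the τ conversion above rejects rather than interprets.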

Version published to 10.1101/2024.09.24.614506 on bioRxiv
Sep 25, 2024

