Direct inference of the distribution of fitness effects of spontaneous mutations from recombinant inbred C. elegans mutation accumulation lines
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
The distribution of fitness effects (DFE) of new mutations plays a central role in evolutionary biology. Estimates of the DFE from experimental Mutation Accumulation (MA) lines are compromised by the complete linkage disequilibrium (LD) between mutations in different lines. To reduce LD, we constructed two sets of recombinant inbred lines from a cross of two C. elegans MA lines. One set of lines (“RIAILs”) was intercrossed for ten generations prior to ten generations of selfing; the second set of lines (“RILs”) omitted the intercrossing. Residual LD in the RIAILs is much less than in the RILs, which affects the inferred DFE when the sets of lines are analyzed separately. The best-fit model estimated from all lines (RIAILs + RILs) infers a large fraction of mutations with positive effects (∼40%); models that constrain mutations to have negative effects fit much worse. The conclusion is the same using only the RILs. For the RIAILs, however, models that constrain mutations to have negative effects fit nearly as well as models that allow positive effects. When mutations in high LD are pooled into haplotypes, the inferred DFE becomes increasingly negative-skewed and leptokurtic. We conclude that the conventional wisdom (most mutations have effects near zero, a handful of mutations have effects that are substantially negative, and mutations with positive effects are very rare) is likely correct, and that unless it can be shown otherwise, estimates of the DFE that infer a substantial fraction of mutations with positive effects are likely confounded by LD.
Article activity feed
-
Here we employ a classical line-cross strategy with MA lines to break down the linkage disequilibrium among the accumulated mutations. We then combine whole-genome sequencing with high-throughput competitive fitness assays to estimate the DFE of a set of 169 spontaneous mutations.
I greatly enjoyed reading this paper. True experimental estimates of the DFE from MA studies are super valuable and, as the authors point out, provide a very interesting comparison for DFE methods based on population genetics.
-
The magnitude of the raw difference is typically much larger than that of the posterior effects. The difference is likely caused by LD, in that the raw difference of a single mutation contains contributions from other linked mutations, which may inflate the estimates.
Could you constrain this analysis to mutations that are in linkage equilibrium (LE) with the other de novo mutations to test this hypothesis?
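One way the restriction suggested above could be set up is sketched below. It is a minimal illustration, not the authors' pipeline: it assumes fully inbred lines coded 0 (ancestral) or 1 (mutant) at every locus, at least two loci, every locus segregating in the panel, and a hypothetical cutoff `r2_max` for calling a pair of mutations "approximately unlinked". The function name and the cutoff value are illustrative assumptions.

```python
import numpy as np

def loci_in_approx_le(genotypes, r2_max=0.2):
    """Return indices of loci whose largest pairwise r^2 with any other
    locus is below r2_max (a hypothetical cutoff, not from the paper).

    genotypes: (n_lines, n_loci) array of 0 (ancestral) / 1 (mutant).
    Assumes at least two loci and that every locus is polymorphic.
    """
    G = np.asarray(genotypes, dtype=float)
    # For 0/1 genotypes in inbred lines, r^2 is the squared Pearson
    # correlation between the two genotype columns.
    r2 = np.corrcoef(G, rowvar=False) ** 2
    np.fill_diagonal(r2, 0.0)  # ignore each locus's correlation with itself
    return np.flatnonzero(r2.max(axis=1) < r2_max)

# Toy panel: loci 0 and 1 always co-occur; locus 2 is independent of both.
toy_genotypes = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
])
unlinked = loci_in_approx_le(toy_genotypes)
```

The raw-difference versus posterior-effect comparison could then be repeated on the returned subset only; if LD is the cause of the inflation, the two estimates should agree more closely there.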
-
Averaged over all RI(AI)Ls, accounting for variation among assay blocks and removing two outlying lines, the regression of W on number of mutations is not significantly different from 0 (slope = −0.0051, F(1, 509) = 1.83, P > 0.17), although the trend suggests that mutations are deleterious, on average.
Is there a chance that false-negative mutations (i.e., mutations that are present in the MA lines but went undetected) could contribute to this result?
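For readers who want to see the shape of the test quoted above, here is a simplified sketch of an ordinary least-squares regression of fitness on mutation count with its F statistic on (1, n − 2) degrees of freedom. It deliberately omits the assay-block term that the quoted analysis accounts for, and the data are made-up toy values, so it illustrates the calculation rather than reproducing the reported result.

```python
import numpy as np

def slope_and_f(n_mutations, fitness):
    """OLS slope of fitness on mutation count, plus the F statistic
    for the null hypothesis slope = 0 (df = 1, n - 2).

    Simplification: no assay-block effect is fitted here.
    """
    x = np.asarray(n_mutations, dtype=float)
    y = np.asarray(fitness, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_regression = np.sum((fitted - y.mean()) ** 2)
    ss_residual = np.sum((y - fitted) ** 2)
    f_stat = ss_regression / (ss_residual / (len(x) - 2))
    return slope, f_stat

# Toy data (hypothetical): five lines carrying 0 to 4 mutations.
toy_counts = [0, 1, 2, 3, 4]
toy_fitness = [1.00, 0.99, 1.00, 0.97, 0.98]
toy_slope, toy_f = slope_and_f(toy_counts, toy_fitness)
```

Undetected mutations would enter this calculation as errors in `n_mutations`, so rerunning it with counts perturbed under an assumed false-negative rate is one way to gauge how sensitive the slope is to missed calls.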
-
The simplest way to infer the mutational effect at a locus is to calculate the mean value of all lines with a mutant allele and all lines with an ancestral allele at that locus; the difference is the raw difference (uRAW) of the mutation at that locus. As a sanity check, we plotted the inferred Bayesian posterior effect against the raw difference; ideally, the correlation should be +1. The correlations were positive, but well below 1 in all three cases (Figure 4). The magnitude of the raw difference is typically much larger than that of the posterior effects. The difference is likely caused by LD, in that the raw difference of a single mutation contains contributions from other linked mutations, which may inflate the estimates.
Two quick thoughts for further sanity checks. 1) Does this regression look any different for SNPs vs. indels? 2) Do the individual mutation-specific effects conform to the expectations one might have based on the functional annotations available for these mutational events?
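The raw difference described in the quoted passage can be sketched in a few lines. This is an illustration under stated assumptions: genotypes are coded 0 (ancestral) or 1 (mutant), the sign convention is mutant mean minus ancestral mean (the passage does not state the direction), and the function and variable names are hypothetical. The toy data also show why LD inflates raw differences: a neutral mutation that always co-occurs with a deleterious one receives the same raw difference.

```python
import numpy as np

def raw_differences(genotypes, fitness):
    """Per-locus raw difference: mean fitness of lines carrying the mutant
    allele minus mean fitness of lines carrying the ancestral allele.

    genotypes: (n_lines, n_loci) array of 0 (ancestral) / 1 (mutant).
    fitness:   (n_lines,) array of line mean fitness.
    Loci lacking either allele class return NaN.
    """
    G = np.asarray(genotypes, dtype=float)
    w = np.asarray(fitness, dtype=float)
    n_mutant = G.sum(axis=0)
    n_ancestral = (1.0 - G).sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_mutant = (G.T @ w) / n_mutant
        mean_ancestral = ((1.0 - G).T @ w) / n_ancestral
    return mean_mutant - mean_ancestral

# Toy example: two mutations in complete LD; only locus 0 affects fitness.
toy_G = np.array([[1, 1], [1, 1], [0, 0], [0, 0]])
true_effects = np.array([-0.1, 0.0])   # hypothetical effects
toy_w = 1.0 + toy_G @ true_effects     # additive fitness, no noise
toy_raw = raw_differences(toy_G, toy_w)
```

In this toy case both loci get a raw difference of about −0.1 even though the second mutation is neutral, which is the kind of contamination by linked mutations that the passage invokes. Splitting the loci by class (SNP vs. indel) before plotting raw differences against posterior effects would address the first sanity check above.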