Estimating fitness effects of mutations in the presence of genetic linkage

Abstract

Probabilistic prognosis of the evolution of a population requires knowledge of the fitness effects of mutations at different genomic sites. However, the signature of natural selection is obscured by strong noise in the data, because different sites share common ancestors, which makes their evolution interdependent. Together with mutation and recombination, genetic linkage also makes evolution stochastic, so estimates require averaging over many independent populations. Here we develop a method designed to work in the presence of strong linkage effects. To test it, we apply it to simulated genomic sequences generated by a Monte Carlo algorithm with known selection coefficients. The results show that relative selection coefficients are estimated with good accuracy if more than ten independent populations are used for averaging. Infrequent recombination partly compensates for linkage effects and improves the accuracy. The method is then used to estimate the selection coefficients of 10,000 genomic sites of E. coli. These findings enable inference of the adaptive landscape under conditions of strong multi-site linkage.

Author summary

Probabilistic prognosis of the evolution of an organism requires knowledge of the fitness effects of mutations at different genomic sites. However, genetic linkage between sites, which arises from their common phylogenetic history, obscures the effects of natural selection. Here we develop a linkage-resistant method to infer fitness effects and test its accuracy on mock sequences generated by an evolutionary algorithm with known fitness effects. The results demonstrate fair accuracy when averaging is performed over ten or more independently evolving populations. Applying the method to genomic data for E. coli yields relative selection coefficients for thousands of genomic sites.
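
The testing setup described above (mock sequences evolved under known selection coefficients, then averaged over independent populations) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a haploid Wright-Fisher model with binary sites, no recombination, and fitness equal to the exponential of the summed selection coefficients. The function names `simulate_population` and `average_frequencies` and all parameter values are hypothetical.

```python
import math
import random


def simulate_population(sel_coeffs, pop_size, generations, mut_rate, rng):
    """Evolve one population and return the mutant-allele frequency per site.

    Assumed model (not taken from the article): haploid binary genomes,
    no recombination, fitness = exp(sum of s_i over mutated sites).
    """
    n_sites = len(sel_coeffs)
    population = [[0] * n_sites for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: flip each site independently with probability mut_rate.
        for genome in population:
            for i in range(n_sites):
                if rng.random() < mut_rate:
                    genome[i] ^= 1
        # Selection: weight each genome by its fitness.
        weights = [
            math.exp(sum(s for s, allele in zip(sel_coeffs, genome) if allele))
            for genome in population
        ]
        # Whole genomes are resampled, so all sites stay linked.
        parents = rng.choices(population, weights=weights, k=pop_size)
        population = [list(genome) for genome in parents]
    return [sum(g[i] for g in population) / pop_size for i in range(n_sites)]


def average_frequencies(sel_coeffs, n_populations, seed=0,
                        pop_size=100, generations=50, mut_rate=0.01):
    """Average per-site frequencies over independently evolved populations."""
    rng = random.Random(seed)
    runs = [
        simulate_population(sel_coeffs, pop_size, generations, mut_rate, rng)
        for _ in range(n_populations)
    ]
    n_sites = len(sel_coeffs)
    return [sum(run[i] for run in runs) / n_populations for i in range(n_sites)]


if __name__ == "__main__":
    # Hypothetical known selection coefficients for five sites.
    known_s = [0.1, -0.1, 0.05, 0.0, -0.05]
    print(average_frequencies(known_s, n_populations=10))
```

In a single run, linked sites rise or fall together regardless of their individual coefficients; averaging over more independent runs is one simple way to see that noise shrink, which is the motivation for the multi-population averaging the abstract describes.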
