How precise are mutation rate estimates? Comparison of different approaches to estimate de novo mutation rates

Xi Wang
Chaowei Zhang
Hongbo Wang
Kerry Reid
Juha Merilä


Abstract

The availability of de novo mutation rate (µ) estimates that rely on bioinformatic validation has increased tremendously during the past few years, but the accuracy and precision of these estimates often remain unclear because Sanger sequencing validation of the mutations is frequently lacking. We used both long- and short-read sequencing data and different bioinformatic pipelines to estimate µ, as well as false positive rates (FPR) and false negative rates (FNR), for family trios of flat-headed loaches (Oreonectes platycephalus). By comparing estimates against PCR-verified mutations, we observed that the top-performing approach (ranked by F1 score among seven approaches at the same depth) still exhibited a 4% FPR alongside a 12% FNR. Across the remaining methods, FPR values ranged from 4% to 12% and FNR values from 12% to 19%. Irrespective of the bioinformatic approach used, long-read data yielded consistently lower µ estimates than short-read data because of the larger callable genome sizes. In addition, a higher mapping depth resulted in a lower FNR. These results call for caution regarding de novo mutations reported without Sanger sequencing validation in non-model organisms and raise the possibility that many published µ estimates, especially those based on low mapping depths, might be biased.
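To make the quantities in the abstract concrete, the following is a minimal sketch of how FPR, FNR, F1 and an error-corrected µ are commonly computed in trio-based studies. It is not the authors' pipeline. The abstract does not define its rates, so the sketch assumes that FPR is the share of called candidates that fail validation and FNR is the share of verified mutations that a pipeline misses. The correction formula µ = n(1 − FPR) / (2 · C · (1 − FNR)), with C the callable genome size for a diploid offspring, is a widely used convention and likewise an assumption here. All function names and counts are hypothetical.

```python
def error_rates(true_pos: int, false_pos: int, false_neg: int):
    """Return (FPR, FNR, F1) from validation counts.

    Assumed conventions (not stated in the abstract):
      FPR = false_pos / all called candidates
      FNR = false_neg / all verified mutations
    """
    called = true_pos + false_pos
    verified = true_pos + false_neg
    fpr = false_pos / called
    fnr = false_neg / verified
    precision = 1.0 - fpr
    recall = 1.0 - fnr
    f1 = 2.0 * precision * recall / (precision + recall)
    return fpr, fnr, f1


def mutation_rate(n_candidates: int, callable_sites: float,
                  fpr: float = 0.0, fnr: float = 0.0) -> float:
    """Per-site, per-generation rate for one diploid trio offspring.

    The factor 2 accounts for the two parental haplotypes; the FPR and
    FNR terms rescale the raw candidate count to the expected true count.
    """
    return n_candidates * (1.0 - fpr) / (2.0 * callable_sites * (1.0 - fnr))


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    fpr, fnr, f1 = error_rates(true_pos=22, false_pos=1, false_neg=3)
    raw = mutation_rate(23, callable_sites=5e8)
    corrected = mutation_rate(23, callable_sites=5e8, fpr=fpr, fnr=fnr)
    print(f"FPR={fpr:.3f} FNR={fnr:.3f} F1={f1:.3f}")
    print(f"raw mu={raw:.3e} corrected mu={corrected:.3e}")
```

Because the callable genome size sits in the denominator, the same number of candidate mutations gives a lower µ when more of the genome is callable, which is consistent with the direction of the long-read versus short-read difference reported above.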

Version published to 10.21203/rs.3.rs-8867294/v1 on Research Square
Mar 6, 2026

