Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning

Abstract

Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.
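
To make "without the error-prone segmentation step" concrete, here is a minimal sketch of one common segmentation-free decoding scheme. The abstract does not say how Chiron converts network output into bases, so this assumes a connectionist temporal classification (CTC) style greedy decode: take the top label for each signal frame, merge consecutive repeats, and drop a blank symbol. The names `BLANK`, `ALPHABET`, `greedy_collapse` and the toy scores are hypothetical and are not taken from the Chiron code base.

```python
from itertools import groupby

# Assumption: four bases plus a "blank" label, as in CTC-style models.
BLANK = "-"
ALPHABET = ["A", "C", "G", "T", BLANK]


def greedy_collapse(frame_scores):
    """Turn per-frame label scores into a base sequence.

    No event boundaries are needed: the top label is taken for each frame,
    runs of identical labels are merged, and blanks are removed.
    """
    best = [
        ALPHABET[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    merged = [label for label, _ in groupby(best)]
    return "".join(label for label in merged if label != BLANK)


# Made-up scores for eight signal frames (columns follow ALPHABET order).
toy_scores = [
    [0.9, 0.0, 0.0, 0.0, 0.1],  # A
    [0.8, 0.1, 0.0, 0.0, 0.1],  # A (repeat, merged with the frame above)
    [0.1, 0.0, 0.0, 0.0, 0.9],  # blank separates two genuine A bases
    [0.7, 0.1, 0.1, 0.0, 0.1],  # A
    [0.0, 0.9, 0.0, 0.0, 0.1],  # C
    [0.0, 0.0, 0.1, 0.1, 0.8],  # blank
    [0.0, 0.0, 0.0, 0.9, 0.1],  # T
    [0.1, 0.0, 0.0, 0.8, 0.1],  # T (repeat)
]

called = greedy_collapse(toy_scores)
print(called)
```

The blank label is what lets such a decoder emit a homopolymer such as "AA" without first cutting the signal into events. A real basecaller would typically use a beam search over the scores rather than this greedy pass.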

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giy037

    Haotian Teng (1), Minh Duc Cao (1), Michael B. Hall (1), Tania Duarte (1), Sheng Wang (2), Lachlan J.M. Coin (1)

    (1) Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD 4072, Australia
    (2) Department of Human Genetics, University of Chicago, IL 60637, United States

    For correspondence: haotian.teng@uq.net.au, l.coin@imb.uq.edu.au

    A version of this preprint has been published in the open-access journal GigaScience (https://doi.org/10.1093/gigascience/giy037), where the paper and its peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101102
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101103
    Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101104