DFAST_QC: Quality Assessment and Taxonomic Identification Tool for Prokaryotic Genomes

Abstract

Motivation

Accurate taxonomic assignment of genomic data is crucial across biological databases. With the rapid increase in submitted genomes in recent years, precise classification is important for maintaining database integrity. Mislabeled genomes can confuse researchers, hinder analyses, and produce false results. There is therefore a critical need for computationally efficient tools that ensure accurate taxonomic classification of data deposited into genomic databases.

Results

Here we introduce DFAST_QC, a quality control and taxonomic classification tool for prokaryotic genomes based on the NCBI and GTDB taxonomies. We benchmarked DFAST_QC against NCBI taxonomy assignments and found its classifications to be highly consistent with them.

Availability and implementation

DFAST_QC is implemented in Python and is available both as a web service (https://dfast.ddbj.nig.ac.jp/dqc) and as a stand-alone command-line tool. The source code is available under the GPLv3 license at https://github.com/nigyta/dfast_qc, and a conda package is available from Bioconda. The data and scripts used for benchmarking are publicly available on GitHub (https://github.com/Mohamed-Elmanzalawi/DFAST_QC_Benchmark).

Contact

yt@nig.ac.jp

Supplementary information

Supplementary data are available at Bioinformatics online.
