DFAST_QC: Quality Assessment and Taxonomic Identification Tool for Prokaryotic Genomes
Abstract
Motivation
Accurate taxonomic assignment of genomic data is crucial across biological databases. With the rapid increase in submitted genomes in recent years, precise classification is essential to maintaining database integrity. Mislabeled genomes can confuse researchers, hinder analyses, and produce false results. There is therefore a critical need for computationally efficient tools that verify the taxonomic classification of data before it is deposited into genomic databases.
Results
Here we introduce DFAST_QC, a quality control and taxonomic classification tool for prokaryotic genomes based on the NCBI and GTDB taxonomies. We benchmarked DFAST_QC against NCBI taxonomy assignments and found its classifications to be highly consistent with them.
Availability and implementation
DFAST_QC is implemented in Python and is available both as a web service (https://dfast.ddbj.nig.ac.jp/dqc) and as a stand-alone command-line tool. The source code is available under the GPLv3 license at https://github.com/nigyta/dfast_qc, and a conda package is available from Bioconda. The data and scripts used for benchmarking are publicly available on GitHub (https://github.com/Mohamed-Elmanzalawi/DFAST_QC_Benchmark).
Contact
yt@nig.ac.jp
Supplementary information
Supplementary data are available at Bioinformatics online.