Chlomito: a novel tool for precise elimination of organelle genome contamination from nuclear genome assembly
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
Accurate reference genomes are fundamental to understanding biological evolution, biodiversity, hereditary phenomena and diseases. However, assembled nuclear chromosomes are often contaminated by organelle genome sequences, which can mislead bioinformatic analyses and the interpretation of genomic and transcriptomic data.
Methods
To address this issue, we developed Chlomito, a tool for the precise identification and elimination of organelle genome contamination from nuclear genome assemblies. Unlike conventional approaches, Chlomito uses two new metrics, the alignment length coverage ratio (ALCR) and the sequencing depth ratio (SDR), to effectively distinguish true organelle genome sequences from organelle-derived sequences transferred into nuclear genomes via horizontal gene transfer (HGT).
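As a rough illustration of how two such metrics could be combined, the sketch below classifies contigs from per-contig summary statistics. The abstract does not give the formulas, so everything here is an assumption: ALCR is taken to be the fraction of a contig's length that aligns to an organelle reference, SDR is taken to be the contig's mean read depth divided by the average nuclear depth, and the names `ContigStats`, `is_organelle_contig`, `min_alcr` and `min_sdr` as well as the threshold values are hypothetical, not Chlomito's actual interface or defaults.

```python
from dataclasses import dataclass


@dataclass
class ContigStats:
    """Per-contig summary (hypothetical structure, not Chlomito's own)."""
    name: str
    length: int        # contig length in bp
    aligned_bp: int    # bp of the contig aligned to an organelle reference
    mean_depth: float  # mean sequencing depth over the contig


def alcr(contig: ContigStats) -> float:
    # Assumed definition: aligned fraction of the contig length.
    return contig.aligned_bp / contig.length


def sdr(contig: ContigStats, nuclear_depth: float) -> float:
    # Assumed definition: contig depth relative to average nuclear depth.
    return contig.mean_depth / nuclear_depth


def is_organelle_contig(contig: ContigStats, nuclear_depth: float,
                        min_alcr: float = 0.5, min_sdr: float = 5.0) -> bool:
    # Both thresholds are illustrative placeholders. A contig is flagged
    # only if most of it aligns to an organelle AND its depth is far above
    # the nuclear average, as expected for high-copy organelle DNA.
    return alcr(contig) >= min_alcr and sdr(contig, nuclear_depth) >= min_sdr


# A true organelle contig versus a nuclear contig carrying a short insert.
organelle = ContigStats("ctg_cp", length=100_000, aligned_bp=95_000, mean_depth=900.0)
nuclear = ContigStats("ctg_nuc", length=1_000_000, aligned_bp=5_000, mean_depth=31.0)
flags = {c.name: is_organelle_contig(c, nuclear_depth=30.0) for c in (organelle, nuclear)}
```

Under these assumptions, a nuclear contig that merely contains a transferred organelle fragment fails both tests: only a small fraction of its length aligns, and its depth stays close to the nuclear average.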
Results
The accuracy of Chlomito was tested using sequencing data from Plum, Mango and Arabidopsis. The results confirmed that Chlomito accurately detects contigs originating from organelle genomes and that the identified contigs cover most regions of the organelle reference genomes, demonstrating the efficiency and precision of Chlomito. For user convenience, we further packaged the method into a Docker image, simplifying the data processing workflow.
Discussion
Overall, Chlomito provides an efficient, accurate and convenient method for identifying and removing contigs derived from organelle genomes in genomic assembly data, contributing to the improvement of genome assembly quality.
Article activity feed
-
nd 512 GB of RA
Is this the recommended amount of RAM to run Chlomito? This is quite high.
-
This link gave me a 404
-
e. By combining these two metrics, we can significantly improve the accuracy of identifying and removing organelle genome sequences from genome assembly d
I'm assuming the second metric relies on mapped reads. Did you consider identifying spanning reads as further evidence for your tool? If a read spans an organellar genome sequence and a nuclear genome sequence (perhaps with k=21 bp overlap at minimum, or potentially higher), then I think that would show evidence of an HGT event.
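To make the spanning-read idea concrete, here is a minimal sketch. It is purely illustrative and not part of Chlomito. It assumes each read comes with a list of alignment segments in read coordinates, each labelled as hitting an organelle or a nuclear reference. It interprets the 21 bp figure as a minimum anchor on each side that the other alignment does not also cover. The name `is_spanning_read` and the segment format are hypothetical.

```python
from typing import List, Tuple

# (read_start, read_end, kind) in half-open read coordinates;
# kind is "organelle" or "nuclear". Hypothetical input format.
Segment = Tuple[int, int, str]


def is_spanning_read(segments: List[Segment], min_anchor: int = 21) -> bool:
    """Return True if the read has an organelle part and a nuclear part,
    each at least min_anchor bp long after discounting their overlap."""
    organelle = [(s, e) for s, e, kind in segments if kind == "organelle"]
    nuclear = [(s, e) for s, e, kind in segments if kind == "nuclear"]
    for o_start, o_end in organelle:
        for n_start, n_end in nuclear:
            # Bases of the read covered by both alignments.
            overlap = max(0, min(o_end, n_end) - max(o_start, n_start))
            o_unique = (o_end - o_start) - overlap
            n_unique = (n_end - n_start) - overlap
            if o_unique >= min_anchor and n_unique >= min_anchor:
                return True
    return False


# A read whose first 5 kb is nuclear and last 3 kb is organellar.
junction_read = [(0, 5000, "nuclear"), (5000, 8000, "organelle")]
# A read that aligns end to end to both references (ambiguous, not spanning).
ambiguous_read = [(0, 8000, "organelle"), (0, 8000, "nuclear")]
results = (is_spanning_read(junction_read), is_spanning_read(ambiguous_read))
```

Discounting the overlap matters because a read sampled entirely from an organelle genome can also align to a nuclear copy of the same sequence. Such a read should not count as evidence of a junction.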
-
ce, we packaged this method into a Docker ima
Is Chlomito available on GitHub? Where can the Docker image be downloaded from?
-
Plum and Mang
Would you be willing to provide details on the quality of these two genomes? How well characterized are the chloroplast and mitochondrial sequences in these species, and do they have gold-standard labels?