DeepMM: Identify and correct Metagenome Misassemblies with deep learning
Abstract
Accurate metagenomic assemblies are essential for constructing reliable metagenome-assembled genomes (MAGs). However, the complexity of microbial genomes continues to pose challenges for accurate assembly. Current reference-free assembly evaluation tools rely primarily on hand-crafted features and generalize poorly across different metagenomic data. To address these limitations, we propose DeepMM, a novel deep learning-based visual model designed to identify and correct metagenomic misassemblies. DeepMM transforms alignments between assemblies and reads into a multi-channel image for misassembly feature learning and applies contrastive learning to bring different views of misassemblies closer together. Furthermore, DeepMM offers a fine-tuning process to adapt to data from different sequencers. Our results show that DeepMM outperforms state-of-the-art methods in identifying misassemblies, achieving the highest AUPRC score on five CAMI datasets. DeepMM also corrects misassemblies accurately, which significantly improves downstream binning results: the number of near-complete MAGs increases from 905 to 1006 in a large real metagenomic sequencing dataset derived from a diarrhea-predominant Irritable Bowel Syndrome (IBS-D) cohort.
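To make the contrastive-learning idea concrete, the following is a minimal sketch of a generic InfoNCE-style loss, which rewards embeddings of two views of the same item for being more similar to each other than to embeddings of other items. The abstract does not state which contrastive objective DeepMM uses, so the loss form, the temperature value, the function names `cosine` and `info_nce`, and the toy embedding vectors are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (assumed form, not DeepMM's stated loss).

    Computes the negative log-softmax of the positive pair's similarity
    against the similarities to all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Hypothetical embeddings: two views of one misassembled window,
# plus embeddings of two unrelated windows.
view_a = [0.9, 0.1, 0.0]
view_b = [0.8, 0.2, 0.1]
others = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]

# Loss when the true second view is treated as the positive.
aligned_loss = info_nce(view_a, view_b, others)
# Loss when an unrelated window is (wrongly) treated as the positive.
misaligned_loss = info_nce(view_a, others[0], [view_b, others[1]])
```

In this toy setting `aligned_loss` is much smaller than `misaligned_loss`, which is the behavior a contrastive objective exploits: minimizing it pulls different views of the same misassembly together and pushes unrelated windows apart.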