Accurate MAG reconstruction from complex soil microbiome through combined short- and HiFi long-reads metagenomics

Abstract

Background

Advances in high-fidelity long-read (HiFi-LR) sequencing technologies offer unprecedented opportunities to uncover the microbial genomic diversity of complex environments such as soils. While short-read (SR) sequencing has provided broad insights into gene-level diversity, its inherently limited read length constrains the reconstruction of complete genomes. Conversely, HiFi-LR sequencing enhances the quality and completeness of metagenome-assembled genomes (MAGs), enabling higher-resolution taxonomic and functional annotation. However, the cost and relatively low throughput of HiFi-LR sequencing can limit genome recovery, particularly at the binning stage, where coverage depth is critical.

Results

Here, we present a novel hybrid strategy that differs from classical hybrid assembly, in which SR and LR data are used jointly at the assembly step. Instead, we use high-depth SR data to improve the binning of HiFi-LR contigs. Using both SR and HiFi-LR metagenomic data generated from a tunnel-cultivated soil sample, we demonstrate that SR-derived coverage information significantly improves the binning of HiFi-LR assemblies. This yields a substantial increase in the number and quality of recovered MAGs compared with HiFi-LR data alone, and a far larger improvement compared with SR data alone.
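
As an illustration of the coverage signal involved, the following minimal Python sketch computes a mean SR depth for each HiFi-LR contig from short-read alignment intervals. It is not the authors' pipeline: the abstract does not name the mapping or binning tools used, and the function name, the input layout (0-based, half-open intervals) and the toy values are assumptions made for this example only. In practice such depths would typically be derived from alignment files and then supplied to a coverage-aware binner.

```python
from collections import defaultdict


def mean_depth_per_contig(contig_lengths, alignments):
    """Return the mean short-read depth of each contig.

    contig_lengths: dict mapping contig name -> length in bases.
    alignments: iterable of (contig, start, end) tuples, 0-based and
        half-open, one per aligned short read (hypothetical layout).
    """
    aligned_bases = defaultdict(int)
    for contig, start, end in alignments:
        if contig not in contig_lengths:
            continue  # ignore reads mapped to unknown contigs
        length = contig_lengths[contig]
        # Clip each interval to the contig boundaries.
        start = max(start, 0)
        end = min(end, length)
        if end > start:
            aligned_bases[contig] += end - start
    # Mean depth = total aligned bases / contig length.
    return {
        contig: aligned_bases[contig] / length
        for contig, length in contig_lengths.items()
        if length > 0
    }


# Toy input: two contigs and four aligned short reads (illustrative values).
contig_lengths = {"contig_1": 1000, "contig_2": 500}
alignments = [
    ("contig_1", 0, 150),
    ("contig_1", 100, 250),
    ("contig_2", 0, 150),
    ("contig_x", 0, 150),  # not in the assembly, so it is skipped
]
depths = mean_depth_per_contig(contig_lengths, alignments)
```

Contigs that originate from the same genome are expected to show similar depth, which is why a deeper SR data set can give a more reliable per-contig coverage estimate than lower-throughput long reads alone.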

Conclusion

Our findings highlight the power of combining SR and LR data in highly diverse environments such as soil, not for hybrid assembly per se, but to enhance the downstream binning process and overall genome recovery. Importantly, this approach underscores the potential of leveraging the vast amount of publicly available Illumina metagenomic datasets. Complementing existing SR resources with PacBio HiFi sequencing can maximise assembly contiguity and binning accuracy while drawing on the massive amounts of SR data already generated. This is a practical and forward-looking strategy for microbiome research, in which novel LR technologies bring new value to previous short-read efforts.
