Benchmarking Alignment Strategies for Hi-C Reads in Metagenomic Hi-C Data
Abstract
Background
Metagenomics combined with High-throughput Chromosome Conformation Capture (Hi-C) offers a powerful approach to study microbial communities by linking genomic content with spatial interactions. Hi-C enhances shotgun sequencing by revealing taxonomic composition, functional interactions, and genomic organization from a single sample. However, aligning Hi-C reads to metagenomic contigs presents challenges, including the unique statistical distribution of Hi-C paired-end reads, multi-species complexity, and gaps in assemblies. Although many benchmark studies have evaluated general-purpose alignment tools and the alignment of Hi-C data, none has specifically addressed metagenomic Hi-C data.
Results
Here, we selected seven alignment strategies that have been used in Hi-C analyses: BWA MEM -5SP, BWA MEM default, BWA aln default, Bowtie2 default, Bowtie2 --very-sensitive-local, Minimap2 default, and Chromap default. We benchmarked them on one synthetic and seven real-world environments, evaluating each strategy by the number of inter-contig Hi-C read pairs it yields and by its influence on downstream tasks, such as binning quality.
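To make the first evaluation metric concrete, the following is a minimal sketch of how inter-contig Hi-C read pairs could be counted from an aligner's SAM output. It is not the pipeline used in this study: the function name `count_inter_contig_pairs`, the `min_mapq` parameter, and the choice to keep only primary alignments are assumptions made for illustration. The sketch relies only on standard SAM FLAG bits (0x4 unmapped, 0x40 first mate, 0x80 second mate, 0x100 secondary, 0x800 supplementary).

```python
from collections import defaultdict

# Standard SAM FLAG bits
UNMAPPED, FIRST, SECOND = 0x4, 0x40, 0x80
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def count_inter_contig_pairs(sam_lines, min_mapq=0):
    """Return (inter, intra): pairs whose mates map to different / the same contig.

    Hypothetical helper for illustration. Only primary alignments are kept,
    and a pair is counted only if both mates are mapped with MAPQ >= min_mapq.
    """
    mates = defaultdict(dict)  # read name -> {1: contig, 2: contig}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a valid alignment record
        qname, flag, rname, mapq = fields[0], int(fields[1]), fields[2], int(fields[4])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue  # keep mapped primary alignments only
        if mapq < min_mapq:
            continue
        if flag & FIRST:
            mates[qname][1] = rname
        elif flag & SECOND:
            mates[qname][2] = rname

    inter = intra = 0
    for contigs in mates.values():
        if len(contigs) != 2:
            continue  # one mate unmapped or filtered out
        if contigs[1] != contigs[2]:
            inter += 1
        else:
            intra += 1
    return inter, intra
```

The function accepts any iterable of SAM text lines, for example an open file handle. Because it keeps every read name in memory, it is suited to small test files rather than full-scale datasets. Raising `min_mapq` trades the number of retained pairs for alignment confidence.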
Conclusion
Our findings show that BWA MEM -5SP consistently outperforms other tools across all environments in terms of inter-contig read pairs and binning quality, followed by BWA MEM default. Chromap and Minimap2, while less effective in these metrics, demonstrate the highest computational efficiency.