An nf-core framework for the systematic comparison of alternative modeling tools: the multiple sequence alignment case study
Abstract
The computational complexity of many key bioinformatics problems has resulted in numerous alternative heuristic solutions, none of which consistently outperforms all others. This creates difficulties both for users trying to identify the most suitable tool for their dataset and for developers managing and evaluating alternative methods. As data volumes grow, deploying these methods becomes increasingly difficult, highlighting the need for standardized frameworks that support seamless tool deployment and comparison in high-performance computing (HPC) environments. Multiple sequence alignment (MSA) methods rank among the most commonly employed modeling techniques in bioinformatics, playing a crucial role in applications such as protein structure prediction, phylogenetic reconstruction, and variant effect prediction. Computing an optimal MSA is an NP-hard problem, which makes it a prime example of a computational challenge where heuristic solutions are essential. Here, we present a pilot design of an nf-core framework for streamlined tool deployment and rigorous performance evaluation, focusing on the MSA software ecosystem. Although it is showcased through the integration of popular MSA tools and designed to directly benefit the MSA community, the framework also serves as a proof of principle for the broader bioinformatics community. nf-core/multiplesequencealign is free, open-source software available at https://nf-co.re/multiplesequencealign.