GeNePi: a GPU-enhanced Next Generation Bioinformatics Pipeline for Whole Genome Sequencing Analysis

Abstract

Next Generation Sequencing (NGS) has revolutionized genome biology, enabling the rapid sequencing of an entire human genome and facilitating the integration of Whole Genome Sequencing (WGS) into both research and clinical applications. The high-throughput nature of NGS and the complex data processing it requires have driven the need for advanced computational infrastructures to analyse these large datasets. The aim of this work is to introduce an innovative bioinformatics pipeline, named GeNePi, for the efficient and precise analysis of WGS short paired-end reads. Built on the Nextflow framework with a modular structure, GeNePi incorporates GPU-accelerated algorithms and supports multiple workflow configurations. The pipeline automates the extraction of biologically relevant insights from raw WGS data, including disease-related variants such as single nucleotide variants (SNVs), small insertions or deletions (INDELs), copy number variants (CNVs), and structural variants (SVs). Optimized for high-performance computing (HPC) environments, it takes advantage of job-scheduler submissions, parallelised processing, and resource allocation tailored to each analysis step. Tested on synthetic and real datasets, GeNePi accurately identifies genomic variants, with performance comparable to that of state-of-the-art tools. These features make GeNePi a valuable instrument for large-scale analyses in both research and clinical contexts, representing a key step towards the establishment of National Centers for Computational and Technological Medicine.
