DIANA: An integrated pipeline for analysis of long-read whole-genome sequencing data for molecular neuropathology

Abstract

Summary

Central nervous system (CNS) tumor diagnosis requires comprehensive genomic profiling, including DNA methylation classification, copy-number variant (CNV) analysis, gene fusion analysis, small variant detection, and MGMT promoter methylation status. Long-read sequencing platforms such as nanopore sequencing by Oxford Nanopore Technologies and SMRTseq by PacBio can capture all of these in a single assay, but integrating the diverse analytical tools needed to leverage the advantages of long-read sequencing remains complex. We present DIANA (Diagnostic Integrated Analytics of Neoplastic Alterations), a pipeline that provides fully automated end-to-end processing of long-read whole-genome sequencing data, starting from aligned sequence reads. DIANA produces a human-readable report that combines methylation classification with prioritized genetic variants to support CNS tumor diagnostics and clinical decision-making.

Availability and implementation

DIANA is an open-source Nextflow pipeline, freely available through Docker or Apptainer/Singularity technologies. The source code, comprehensive documentation, and installation protocols are available on GitHub: https://github.com/VilhelmMagnusLab/DIANA.git.

Supplementary information

Supplementary data are available at Bioinformatics online.
