Early Lung Cancer Detection Using Nucleotide Transition Probabilities in Plasma Cell-Free DNA


Abstract

Lung cancer, the most lethal malignancy globally, urgently requires effective early detection methods. Current non-invasive approaches based on plasma cell-free DNA (cfDNA) fragmentomics are often constrained by limited sensitivity in early-stage patients, whose plasma carries only a low fraction of tumor-derived DNA. To overcome this, we introduce a novel computational feature, First-Order Transition Probability (FOTP), which captures sequential dependencies between adjacent nucleotides within cfDNA fragments. Analyzing low-pass whole-genome sequencing data from 1,036 participants, we demonstrate that the first 10 bp at the 5′ end of cfDNA fragments harbor the most discriminative information for cancer detection. A support vector machine (SVM) model built on FOTP achieved an AUC of 0.942, with 73.9% sensitivity for stage I and 81.8% sensitivity for stage II lung cancer at 95% specificity, significantly outperforming existing fragmentomic features. Furthermore, the method generalized robustly across independent and multi-cancer validation sets, including hepatocellular carcinoma (HCC), colorectal cancer (CRC), and head and neck squamous cell carcinoma (HNSCC), and showed potential for tissue-of-origin identification. These findings are supported by nucleotide frequency stability and entropy patterns beyond the initial 10 bp, reflecting underlying nuclease cleavage biases and chromatin features. This work establishes FOTP as a biologically interpretable and highly efficient feature for pan-cancer early detection, offering a scalable pathway toward population-wide screening programs.
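
The abstract does not spell out how FOTP is computed, so the following is only a minimal sketch of one plausible reading: a first-order Markov transition matrix P(next base | current base), pooled over the first 10 bp at the 5′ end of each fragment. The function name `fotp_features`, the `prefix_len` parameter, the position-pooled estimate, and the toy fragments are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

BASES = "ACGT"


def fotp_features(fragments, prefix_len=10):
    """Estimate P(next = Y | current = X) from the 5' prefix of each fragment.

    Hypothetical helper: pools adjacent-base pairs over all positions in the
    first `prefix_len` bases, then row-normalizes the 4x4 count matrix.
    Returns a dict with 16 entries keyed by the dinucleotide "XY".
    """
    pair_counts = Counter()
    for seq in fragments:
        prefix = seq[:prefix_len].upper()
        # Count each adjacent (current, next) pair inside the prefix.
        for cur, nxt in zip(prefix, prefix[1:]):
            if cur in BASES and nxt in BASES:  # skip N and other ambiguity codes
                pair_counts[cur + nxt] += 1

    features = {}
    for cur in BASES:
        row_total = sum(pair_counts[cur + nxt] for nxt in BASES)
        for nxt in BASES:
            # A base that never occurs as "current" gets an all-zero row.
            features[cur + nxt] = (
                pair_counts[cur + nxt] / row_total if row_total else 0.0
            )
    return features


# Toy fragments, invented purely to exercise the function.
toy_fragments = ["ACGTACGTACGG", "AACCGGTTAACC", "ACACACACACAC"]
features = fotp_features(toy_fragments)
```

Under this reading, each sample yields a 16-dimensional vector that could be passed to a classifier such as an SVM. A position-specific variant (one 4×4 matrix per adjacent position pair in the prefix) is an equally plausible interpretation that the abstract does not rule out.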
