Open chromatin-guided interpretable machine learning reveals cancer-specific chromatin features in cell-free DNA
Abstract
Cell-free DNAs (cfDNAs) are DNA fragments found in blood, originating mainly from immune cells in healthy individuals and from both immune and cancer cells in cancer patients. While cancer-derived cfDNAs carry mutations, they also retain epigenetic features such as DNA methylation and nucleosome positioning. In this study, we examine nucleosome enrichment patterns in cfDNAs from breast and pancreatic cancer patients and find significant enrichment at open chromatin regions. Differential enrichment is observed not only at cancer cell type-specific ATAC-seq peaks but also at CD4+ T cell-specific peaks, suggesting both tumor- and immune-derived contributions to the cfDNA signal. To leverage these patterns, we apply an interpretable machine learning model (XGBoost) trained on cell type-specific open chromatin regions. This approach improves cancer detection accuracy and highlights key genomic loci associated with the disease state. Our pipeline provides a robust and interpretable framework for cfDNA-based cancer detection.
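The abstract does not say how per-region features are derived before model training. As a rough illustration only, the sketch below counts cfDNA fragment midpoints falling inside each open chromatin peak and scales the counts to fragments per million, giving one feature per peak for each sample. The function name `peak_coverage_features`, the midpoint-counting rule, the per-million scaling, and the toy coordinates are all assumptions made for this example, not the authors' pipeline.

```python
from bisect import bisect_left


def peak_coverage_features(fragments, peaks):
    """Return one value per peak: fragment midpoints inside the peak,
    scaled to counts per million fragments.

    fragments, peaks: lists of (chrom, start, end) with half-open coordinates.
    Hypothetical helper; the source does not define its feature computation.
    """
    # Collect sorted fragment midpoints per chromosome for fast range counting.
    midpoints = {}
    for chrom, start, end in fragments:
        midpoints.setdefault(chrom, []).append((start + end) // 2)
    for chrom in midpoints:
        midpoints[chrom].sort()

    total = len(fragments)
    scale = 1e6 / total if total else 0.0

    features = []
    for chrom, start, end in peaks:
        mids = midpoints.get(chrom, [])
        # Number of midpoints m with start <= m < end.
        count = bisect_left(mids, end) - bisect_left(mids, start)
        features.append(count * scale)
    return features


# Toy data (made up for illustration).
fragments = [
    ("chr1", 100, 260),
    ("chr1", 150, 320),
    ("chr1", 5000, 5170),
    ("chr2", 40, 200),
]
peaks = [
    ("chr1", 150, 250),
    ("chr1", 5000, 5200),
    ("chr2", 1000, 1200),
]
features = peak_coverage_features(fragments, peaks)
```

Stacking such vectors across samples would give a samples-by-peaks matrix that a gradient-boosted tree classifier such as XGBoost could be trained on, with per-peak feature importances pointing back to specific loci.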