WAD: a wavelet-based linear programming method using L1-minimal reconstruction loss for accessible chromatin data deconvolution
Abstract
Bulk tissue-based accessible chromatin studies provide summary annotations across all cell types within the tissue. These annotations can be skewed by varying proportions of individual cell types, especially in disease studies. Estimating sample-specific cell-type proportions can mitigate the effects of this variability and also reveal whether cell-type proportions change significantly under particular conditions, such as disease. We present WAD (Wavelet-based Accessible chromatin Deconvolution), a principled framework for robust estimation of the cell-type composition of bulk accessible chromatin data, such as data from the ATAC-seq assay. To derive informative reference cell-type profiles from single-cell accessible chromatin studies, WAD uses wavelet-based denoising to suppress stochastic noise while preserving local chromatin continuity. Cell-type proportion inference is reformulated as a linear programming problem that minimizes the L1 reconstruction loss, enabling scalable and interpretable solutions. Across 700 in silico pseudo-bulk mixtures generated from single-cell data, WAD achieved a consistently lower mean absolute error (MAE) and higher concordance (r > 0.85) than existing machine learning-based methods. These results demonstrate that wavelet-based feature extraction provides a biologically grounded and computationally efficient approach to chromatin signal deconvolution. A complete implementation of WAD is available at https://github.com/chae-jh/WAD.
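To illustrate how an L1 reconstruction loss can be posed as a linear program, the following is a minimal sketch and not the WAD implementation itself. It assumes a reference matrix of cell-type profiles (regions by cell types) and a bulk signal vector, and it assumes the proportions are non-negative and sum to one; the abstract does not state these constraints. The function name `l1_deconvolve` and the toy data are hypothetical. The absolute residuals are handled with the standard trick of introducing one slack variable per region that bounds the residual from both sides.

```python
import numpy as np
from scipy.optimize import linprog


def l1_deconvolve(reference, bulk):
    """Estimate proportions p minimizing ||bulk - reference @ p||_1.

    Hypothetical helper: assumes p >= 0 and sum(p) == 1.
    Decision variables are x = [p (n_types), t (n_regions)], where
    t[i] bounds the absolute residual of region i.
    """
    n_regions, n_types = reference.shape

    # Objective: minimize the sum of the slack variables t.
    cost = np.concatenate([np.zeros(n_types), np.ones(n_regions)])

    # Encode |bulk - reference @ p| <= t as two sets of inequalities:
    #   reference @ p - t <=  bulk
    #  -reference @ p - t <= -bulk
    eye = np.eye(n_regions)
    A_ub = np.block([[reference, -eye], [-reference, -eye]])
    b_ub = np.concatenate([bulk, -bulk])

    # Assumed simplex constraint: proportions sum to one.
    A_eq = np.concatenate([np.ones(n_types), np.zeros(n_regions)])[None, :]
    b_eq = np.array([1.0])

    result = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:n_types]


# Toy usage: a noise-free mixture of three synthetic reference profiles.
rng = np.random.default_rng(0)
reference = rng.random((50, 3))
true_props = np.array([0.2, 0.5, 0.3])
bulk = reference @ true_props
estimated = l1_deconvolve(reference, bulk)
```

In this noise-free toy case the estimate should match the mixing proportions up to solver tolerance. With real data the reference profiles would first be denoised (WAD uses wavelet-based denoising for this step), and the L1 objective makes the fit less sensitive to a few outlying regions than a least-squares loss would be.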