Imputing not available values in single-cell DNA methylation data using the median is straightforward and effective

Abstract

Recent advances in single-cell DNA methylation have provided unprecedented opportunities to explore cellular epigenetic differences at maximal resolution. Because the number of methylation sites exceeds the computational limits of current analytical methods, a common workflow for single-cell DNA methylation analysis is to bin the genome into multiple regions and compute the average methylation level within each region. In this process, imputing not available (NA) values, which arise from the limited number of captured methylation sites, is a necessary preprocessing step for downstream analyses. Existing studies have employed several simple imputation methods, such as zero imputation or mean imputation; however, theoretical studies and benchmark tests evaluating these approaches are lacking. Through both experiments and theoretical analysis, we found that imputing missing data with the median is a simple and effective way to reflect the methylation state of the NA values, providing an accurate foundation for downstream analyses.
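
The workflow described in the abstract (bin the genome, average methylation within each bin, then fill NA bins with a median) can be illustrated with a minimal sketch. The abstract does not specify the details, so everything here is an assumption made for illustration: the function names `bin_methylation` and `impute_median`, fixed-width bins, binary methylation calls, and in particular the choice to take the median per region across cells, with a fallback to the overall median when a region has no observed cell.

```python
import numpy as np


def bin_methylation(positions, calls, bin_size, genome_length):
    """Average methylation calls per fixed-width bin for one cell.

    Bins with no captured site are returned as NaN (the NA values).
    Fixed-width binning is an assumption of this sketch.
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    idx = np.asarray(positions, dtype=int) // bin_size
    sums = np.bincount(idx, weights=np.asarray(calls, dtype=float), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    binned = np.full(n_bins, np.nan)
    covered = counts > 0
    binned[covered] = sums[covered] / counts[covered]
    return binned


def impute_median(matrix):
    """Fill NaN entries of a cells x bins matrix with a median.

    Assumption: the median is taken per bin over the cells where that bin
    was observed; a bin observed in no cell falls back to the overall median.
    """
    matrix = np.asarray(matrix, dtype=float)
    filled = matrix.copy()
    overall_median = np.nanmedian(matrix)
    for j in range(matrix.shape[1]):
        column = matrix[:, j]
        missing = np.isnan(column)
        observed = column[~missing]
        fill_value = np.median(observed) if observed.size else overall_median
        filled[missing, j] = fill_value
    return filled


# Toy data: three cells, a 40 bp "genome", 10 bp bins.
cells = [
    bin_methylation([5, 15, 25], [1, 0, 1], bin_size=10, genome_length=40),
    bin_methylation([3, 12, 33], [0, 1, 1], bin_size=10, genome_length=40),
    bin_methylation([8, 27, 36], [1, 1, 0], bin_size=10, genome_length=40),
]
imputed = impute_median(np.vstack(cells))
```

Swapping `np.median` for `np.mean`, or filling with a constant zero, would give the mean and zero imputation baselines that the abstract mentions, which makes it easy to compare the three on the same binned matrix.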
