A Cosine Similarity-Based Method to Infer Variability of Chromatin Accessibility at the Single-Cell Level

Cai, S., Georgakilas, G. K., Johnson, J. L., Vahedi, G.

Frontiers in Genetics


Cellular identity between generations of developing cells is propagated through the epigenome, particularly via the accessible parts of the chromatin. It is now possible to measure chromatin accessibility at single-cell resolution using the single-cell assay for transposase-accessible chromatin (scATAC-seq), which can reveal the regulatory variation behind phenotypic variation. However, single-cell chromatin accessibility data are sparse, binary, and high dimensional, which poses unique computational challenges. To overcome these difficulties, we developed PRISM, a computational workflow that quantifies cell-to-cell variation in chromatin accessibility while controlling for technical biases. PRISM is a novel multidimensional scaling-based method that combines an angular cosine distance metric with the distance of each cell from the spatial centroid. In doing so, PRISM takes into account differences in accessibility between single cells at each genomic region. Using both data generated in our lab and publicly available data, we showed that PRISM outperforms an existing algorithm that relies on the aggregate signal across a set of genomic regions. PRISM was also robust to noise in cells with low coverage. Our approach revealed previously undetected accessibility variation in which the accessible sites differ between cells while the total number of accessible sites remains constant. We also showed that PRISM, but not the existing algorithm, can detect suppressed heterogeneity of accessibility at CTCF binding sites. Our approach uncovers new biological results with profound implications for the cellular heterogeneity of chromatin architecture.
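
The idea of combining an angular cosine distance with multidimensional scaling and distance from the centroid can be illustrated with a minimal sketch. This is not the PRISM implementation: the function names, the scaling of the angle by pi, the use of classical (eigendecomposition-based) multidimensional scaling, and the choice of the mean distance from the centroid as the summary statistic are all assumptions made for illustration, and no correction for technical bias is attempted. The toy example compares two populations of cells that have the same number of accessible sites per cell, which an aggregate-signal measure could not tell apart.

```python
import numpy as np


def angular_cosine_distance(X):
    """Pairwise angular distance between rows of a binary cell-by-region matrix.

    Assumption: the angle is divided by pi so that distances lie in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    if np.any(norms == 0):
        raise ValueError("every cell needs at least one accessible region")
    sim = (X @ X.T) / np.outer(norms, norms)
    sim = np.clip(sim, -1.0, 1.0)  # guard against floating-point overshoot
    D = np.arccos(sim) / np.pi
    np.fill_diagonal(D, 0.0)
    return D


def classical_mds(D, n_components=2):
    """Classical multidimensional scaling of a distance matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    vals = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues
    return eigvecs[:, order] * np.sqrt(vals)


def centroid_variability(X, n_components=2):
    """Mean distance of the embedded cells from their centroid (hypothetical score)."""
    coords = classical_mds(angular_cosine_distance(X), n_components)
    centroid = coords.mean(axis=0)
    return float(np.mean(np.linalg.norm(coords - centroid, axis=1)))


# Toy data: 4 cells x 6 regions, 3 accessible sites per cell in both populations.
homogeneous = np.array([[1, 1, 1, 0, 0, 0]] * 4)
heterogeneous = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1],
])

hom_score = centroid_variability(homogeneous)
het_score = centroid_variability(heterogeneous)
print("homogeneous:", hom_score)
print("heterogeneous:", het_score)
```

In this toy setting the per-cell totals are identical across both populations, yet the centroid-based score is larger for the population whose accessible sites differ from cell to cell, which is the kind of variation the abstract describes.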