scran::computeSumFactors with single-cell ATAC-seq data


What are your thoughts on using scran::computeSumFactors with (10x) single-cell ATAC-seq data? Should one use the default value of min.mean = 1 for read data? I called scater::calculateAverage on one of my datasets, and 89% and 23% of the peaks have a mean count > 0.1 and > 1, respectively. The corresponding values are 18% and 1.5% for a 10x single-cell RNA dataset (so I see why min.mean = 0.1 is needed for UMI data).
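For readers who want to reproduce this kind of summary, here is a minimal base R sketch of the idea: compute a library size-adjusted average count per peak and report the fraction of peaks above each candidate threshold. The count matrix is simulated and purely hypothetical (it is not the dataset described above), and the averaging is only an approximation of what scater::calculateAverage is understood to do, not a call to that function.

```r
# Simulated peak-by-cell count matrix (hypothetical data for illustration only).
set.seed(1)
n_peaks <- 2000
n_cells <- 100
peak_means <- rgamma(n_peaks, shape = 0.5, rate = 1)
# rpois() recycles peak_means down each column, so row i has mean peak_means[i].
counts <- matrix(rpois(n_peaks * n_cells, lambda = peak_means), nrow = n_peaks)

# Library size factors, centered so that their mean is 1 (assumed convention).
lib_sizes <- colSums(counts)
lib_sf <- lib_sizes / mean(lib_sizes)

# Size-adjusted average per peak: divide each cell's counts by its factor,
# then average across cells.
ave <- rowMeans(sweep(counts, 2, lib_sf, "/"))

# Fraction of peaks whose average exceeds each candidate min.mean threshold.
frac_above <- c("0.1" = mean(ave > 0.1), "1" = mean(ave > 1))
print(frac_above)
```

On real data, the same two fractions indicate how many features would survive a given min.mean filter.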

An ATAC-specific concern is that, as the size factor increases, the (measured) number of cuts in small peaks will stop increasing at some point and result in incorrect ratios.

Aaron Lun ★ 28k

I don't handle scATAC-seq data personally, but if you're dealing with peaks, I would guess that the features are defined based on being somewhat high-abundance (e.g., in the pool across all cells). So, in effect, Cellranger has already done a bit of filtering for you, in contrast to the gene expression case where you just get a (possibly zero) count reported for all genes regardless of whether it's actually expressed or not. This would explain why you get a higher percentage of features with means above the threshold.

As for the specific threshold to use - if it's UMI data, you might as well use 0.1 and make use of more of your features. Remember, someone's already done the filtering for you, so there's no clear need to be even more aggressive with the filtering in computeSumFactors() on top of that.

I don't really understand what you mean by the number of cuts in small peaks. I assume you're referring to the coverage being capped because diploid cells only have 2 copies of each chromosome. That's true enough, but if your size factor continues to increase regardless, it indicates that the cap isn't really in effect, e.g., due to PCR duplicates or whatever. (This assumes that the size factor calculation adjusts for composition biases.)
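The capping concern can be made concrete with a small deterministic sketch in base R. It compares the expected count in a small peak with and without a hard cap as the size factor grows. The size factors, the expected count of 1 at a size factor of 1, and the cap of 2 are all illustrative assumptions, not values from any real dataset.

```r
# Hypothetical per-cell size factors spanning a 16-fold range.
size_factors <- c(0.5, 1, 2, 4, 8)
# Assumed expected count in a small peak for a cell with size factor 1.
lambda_small <- 1
# Assumed hard cap on the count (e.g., limited by chromosome copy number).
cap <- 2

# Expected counts if coverage scales freely with the size factor...
uncapped <- size_factors * lambda_small
# ...and if it cannot exceed the cap.
capped <- pmin(uncapped, cap)

# After dividing by the size factor, the uncapped peak is flat across cells,
# while the capped peak appears to lose signal in high-size-factor cells.
norm_uncapped <- uncapped / size_factors
norm_capped <- capped / size_factors

print(data.frame(size_factors, uncapped, capped, norm_uncapped, norm_capped))
```

If the cap were really in effect for most peaks, the counts driving the size factor would themselves stop growing, which is the point made above: a size factor that keeps increasing suggests the cap is not binding for the features used to estimate it.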

