Should edgeR-GLM on single cell RNA data be performed on the counts or normalized data?
amckenz • 0
@amckenz-11264

My question relates to a previous Bioconductor support question: "modeling zero-dominated RNA-seq with voom/limma and hurdle models (pscl)".

Is it better to run the edgeR GLM pipeline on single-cell data using the original counts, or using normalized data (for example, data normalized with scran)?

For clarity, below is the pipeline I am currently planning to use. Should the estimateDisp and glmFit steps be run on the counts or on the normalized data? It seems to me that I should use the counts, because as far as I can tell this is what is typically done for edgeR, but I want to be sure.

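library(edgeR)

# On a count matrix, estimateDisp returns a list of dispersion estimates
disp = estimateDisp(counts, design, robust = TRUE)
# Pass the gene-wise (tagwise) dispersions to glmFit, not the whole list
fit = glmFit(counts, design = design, dispersion = disp$tagwise.dispersion)
contrast_matrix = makeContrasts(MAIN - OTHERS, levels = as.factor(groups))

# Name the argument: the second positional argument of glmLRT is coef, not contrast
fit2 = glmLRT(fit, contrast = contrast_matrix)

toptable = topTags(fit2, adjust.method = "BH", sort.by = "none", n = nrow(fit2))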

davis ▴ 90
@davis-8868
United Kingdom

You should use the counts with edgeR.

scran computes size factors for normalization that are comparable to those from TMM, but adjusted so that they can be estimated reliably from scRNA-seq data with many dropouts (zero counts). Size factors from scran are therefore used in an edgeR workflow in the same way as TMM size factors or similar would be: they enter the model alongside the raw counts rather than being used to transform the counts beforehand.

If you are using scran with an SCESet object for the normalization, then check out the "convertTo" function, which produces a DGEList object ready for analysis with edgeR.
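
As a rough illustration of the above, here is a minimal sketch of how the pieces could fit together. It makes several assumptions that are not from the original question: the counts are simulated, the two groups are simply named MAIN and OTHERS to match the contrast in the question, and the installed scran version is one where computeSumFactors and convertTo accept a SingleCellExperiment (the successor to SCESet). Treat it as a starting point rather than a definitive pipeline.

```r
library(SingleCellExperiment)
library(scran)
library(edgeR)

set.seed(1)

# Simulated raw counts (assumption: 2000 genes x 200 cells, for illustration only)
counts <- matrix(rnbinom(2000 * 200, mu = 5, size = 0.5), nrow = 2000)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
colnames(counts) <- paste0("cell", seq_len(ncol(counts)))
groups <- factor(rep(c("MAIN", "OTHERS"), each = 100))

# Store the raw counts and let scran estimate pooling-based size factors
sce <- SingleCellExperiment(assays = list(counts = counts))
sce <- computeSumFactors(sce)

# Convert to a DGEList: the counts stay raw, the size factors are carried along
y <- convertTo(sce, type = "edgeR")

# Design matrix whose column names match the contrast below
design <- model.matrix(~ 0 + groups)
colnames(design) <- levels(groups)

# Standard edgeR GLM workflow on the DGEList
y <- estimateDisp(y, design, robust = TRUE)
fit <- glmFit(y, design)
contrast_matrix <- makeContrasts(MAIN - OTHERS, levels = design)
lrt <- glmLRT(fit, contrast = contrast_matrix)
res <- topTags(lrt, adjust.method = "BH", sort.by = "none", n = Inf)
```

Working from a DGEList instead of a bare matrix means the dispersions and normalization information travel with the object, so glmFit does not need a separate dispersion argument.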

@steve-lianoglou-2771
United States

You should take a look at scran.
