add group info within DGE-object before TMM normalization
R ▴ 40
@r-5604
Germany

For TMM normalization to be done correctly, do I have to add the groups to the DGEList object so that the calcNormFactors function knows which groups to normalize between?

dge <- DGEList(counts,group=ss$groups) #Should I include group=ss$groups??
dge <- calcNormFactors(dge, method="TMM")
des <- model.matrix(~0+ss$groups)
v <- voom(dge, plot = TRUE)
fit <- lmFit(v, design=des)
...
...


I get slightly different numbers of differentially expressed genes depending on whether I include group in the DGEList object. If I don't include group, does it normalize across all samples rather than between specific groups? Which approach is more correct?

edgeR limma
@gordon-smyth
WEHI, Melbourne, Australia

Setting the group vector makes no difference to calcNormFactors and TMM normalization. Group membership is not used (and should not be used) in the normalization step.
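The point above can be checked directly. The following is a minimal sketch, assuming simulated counts and a hypothetical two-group factor `groups` (neither is from the original post); it compares the TMM factors computed with and without the group set.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 200 genes x 6 samples (illustrative data only)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
groups <- factor(rep(c("A", "B"), each = 3))  # hypothetical grouping

# Same counts, once with the group set and once without
dge_with    <- calcNormFactors(DGEList(counts, group = groups), method = "TMM")
dge_without <- calcNormFactors(DGEList(counts), method = "TMM")

# If group membership is ignored by TMM, the two sets of factors should agree
same_factors <- isTRUE(all.equal(dge_with$samples$norm.factors,
                                 dge_without$samples$norm.factors))
print(same_factors)
```

If the comparison prints TRUE, any difference in the downstream gene lists must come from a later step rather than from normalization.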

On the other hand, setting the group information does make a difference to voom if you run it without a design matrix as you are doing. Normally one passes the design matrix to voom but you have not, hence voom tries to work out what the design matrix should be from the group vector. If you neither pass the design matrix nor set the group vector then voom has no choice but to treat all samples as replicates, which naturally is not correct.

More generally, it is never wrong to set the group factor as part of the DGEList. It might be used or ignored but it will never make an analysis incorrect.


I passed the design matrix to lmFit when creating the fit object; please see the update. Is that wrong? I seem to get a few more significant genes when I also pass the design to voom.


Your questions are answered by the documentation. You need to use the design matrix both for voom and for fitting the linear models. That's what the design matrix is for.
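As a rough sketch of that advice, the workflow from the question might look like the following when the same design matrix is passed to both voom and lmFit. The simulated counts and the `groups` factor are assumptions standing in for the poster's `counts` and `ss$groups`; the downstream steps elided in the question are left out here as well.

```r
library(edgeR)  # attaches limma as a dependency

set.seed(1)
# Simulated counts: 200 genes x 6 samples (illustrative data only)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
groups <- factor(rep(c("A", "B"), each = 3))  # stand-in for ss$groups

dge <- DGEList(counts, group = groups)       # setting group is harmless
dge <- calcNormFactors(dge, method = "TMM")  # group is not used here

# One design matrix, built once and reused in both steps
design <- model.matrix(~0 + groups)
colnames(design) <- levels(groups)

v   <- voom(dge, design = design, plot = FALSE)  # design passed to voom
fit <- lmFit(v, design = design)                 # same design for the fit
```

Building the design matrix once and reusing it keeps the mean-variance weights estimated by voom consistent with the model that lmFit goes on to fit.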