Question

DESeq2, sizeFactors in different comparisons; betaPrior=F

elenigeorgopoulou86 ▴ 10

Hello!

I have 2 questions. The first one is regarding the sizeFactors.

I have the following sample groups:

1. control = no treatment: 5 biol. replicates

2. after 1h of treatment: 5 biol. replicates

3. after 2 h of the same treatment: 5 biol. replicates.

(summa summarum: 1 factor, 3 levels, 15 animals)

My task was to find DE genes between the groups. So I did pairwise comparisons: (i) 1 vs. 2, (ii) 1 vs. 3 and (iii) 2 vs. 3.

The sizeFactors were computed separately for each comparison, which means that the normalized counts for samples from group 1 can differ between, e.g., comparisons (i) and (ii).

My question is: is this wrong? Should I instead compute the sizeFactors for all the samples before testing, and not just for the compared ones?

The second question is regarding the analysis without an intercept, which I did here. Three to four weeks ago, I did not get any error message when my design had no intercept. But today, when I repeated the analysis, I got this error message:

betaPrior=TRUE can only be used if the design has an intercept.

if specifying + 0 in the design formula, use betaPrior=FALSE

So I added the argument betaPrior=FALSE to the DESeq function, and got an extended/different set of genes (e.g. under 5% FDR).

My 2nd question is just a sanity check: did something change in the code during this period, since my input hasn't changed and I used the same code? If so, why does betaPrior have to be FALSE when the design has no intercept?

Thank you in advance!

Eleni

deseq2 betaPrior=FALSE sizeFactors • 3.1k views

updated 9.5 years ago by Michael Love 43k • written 9.5 years ago by elenigeorgopoulou86 ▴ 10

score 2 · Answer 1 · 2015-07-01

1. "Should I instead compute the sizeFactors for all the samples before testing, and not just for the compared ones?"

My typical recommendation would be to put all the groups into one DESeqDataSet, with a design ~condition, run DESeq() on the whole dds, and then extract comparisons like:

results(dds, contrast=c("condition","treatment1hr","control"))
results(dds, contrast=c("condition","treatment2hr","treatment1hr"))
etc.

This way the size factors stay the same across all comparisons, and there are more degrees of freedom for estimating the dispersion parameter.
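To see why per-comparison size factors differ, here is a minimal base-R sketch of the median-of-ratios idea behind size factor estimation. It is a simplified re-implementation for illustration only, not the DESeq2 code; the helper name `median_ratio_sf`, the simulated counts, and the assumption that the 2 h libraries were sequenced twice as deep are all hypothetical.

```r
# Median-of-ratios size factors: each sample is compared with a
# per-gene reference, the geometric mean across the samples supplied.
median_ratio_sf <- function(counts) {
  log_counts <- log(counts)
  log_geo_means <- rowMeans(log_counts)
  keep <- is.finite(log_geo_means)  # drop genes with a zero in any sample
  apply(log_counts[keep, , drop = FALSE], 2, function(col) {
    exp(median(col - log_geo_means[keep]))
  })
}

set.seed(1)
n_genes <- 1000
group <- rep(c("control", "treatment1hr", "treatment2hr"), each = 5)
# Hypothetical sequencing depths: the 2 h group is twice as deep.
depth <- rep(c(1, 1, 2), each = 5)
base_mean <- runif(n_genes, 50, 500)
counts <- sapply(depth, function(d) rpois(n_genes, base_mean * d))
colnames(counts) <- paste0(group, "_", rep(1:5, times = 3))

# Size factors from all 15 samples vs. only the control and 1 h samples.
sf_all <- median_ratio_sf(counts)
sf_sub <- median_ratio_sf(counts[, group != "treatment2hr"])

# The same five control samples get different size factors,
# because the reference changes with the set of samples used.
print(round(rbind(all_samples = sf_all[1:5],
                  control_vs_1h = sf_sub[1:5]), 3))
```

Because the reference is built from whichever samples are passed in, subsetting before normalization changes the size factors of the control samples, and therefore their normalized counts.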

2. In version 1.6 (released October 2014), I added an error check, because combining the beta prior with a design without an intercept is a bad idea. I would have added this error check earlier if I had thought of the situation.

The reason it's a bad idea is that we want to shrink differences between groups of samples. When the model includes an "Intercept" term, these differences are represented by coefficients with names like "conditionTreated" or "condition_Treated_vs_Control". When you remove the intercept term, however, these coefficients no longer represent differences between groups, but the vector from 0 to each group mean. We do not want to shrink those terms to zero. In the no-intercept case, shrinkage provides no statistical benefit while adding bias, whereas the shrinkage of differences (in models with an intercept) adds a little bias while reducing error. (See our paper for more intuition on the shrinkage of differences.)
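The change in what the coefficients mean can be seen with an ordinary linear model in base R. This is only an analogy: DESeq2 fits a negative binomial GLM, so its coefficients are on the log2 scale, but the design-matrix logic is the same. The simulated values in `y` are hypothetical.

```r
set.seed(1)
condition <- factor(rep(c("control", "treatment1hr", "treatment2hr"),
                        each = 5))
# Hypothetical log-scale expression values for one gene.
y <- c(rnorm(5, mean = 10), rnorm(5, mean = 12), rnorm(5, mean = 13))

fit_int   <- lm(y ~ condition)      # design with an intercept
fit_noint <- lm(y ~ 0 + condition)  # design without an intercept

group_means <- tapply(y, condition, mean)

# With an intercept: control mean, then differences from control.
print(coef(fit_int))
# Without an intercept: one coefficient per group, equal to its mean.
print(coef(fit_noint))
print(group_means)
```

With the intercept, the non-intercept coefficients are differences from the control group, which is the quantity worth shrinking toward zero. Without it, each coefficient is a group mean, and shrinking a mean toward zero only adds bias.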