Question: Modelling a multiplicative batch effect with DESeq2
13 days ago by
Samuel0
Samuel0 wrote:

As far as I understand, the process by which DESeq2 models a batch effect is close to subtracting the arithmetic mean of the batches' expression values from the genes, on a per-gene basis.

Is it possible in DESeq2 to model a multiplicative batch effect as well, rather than only an additive one? For example, by dividing the expression levels per gene by the batch-specific geometric mean?

This is the context for my question: I visualized batch effect removal with ComBat and then plotted the different conditions on a PCA, where they separated very nicely. Later, I performed differential expression analysis (I do not pass the ComBat values into DESeq2, but rather model the batch effect using the design formula "~ batch + condition"). Despite the separation in the PCA, very few genes passed the FDR threshold (about 30). I suspect the reason is that ComBat estimates both additive and multiplicative batch effects, while DESeq2 models only additive ones. Judging by the low number of DE genes, I suspect that multiplicative batch effects exist in my data.

P.S. However, each of the compared conditions has only two samples; that might be an alternative explanation for the low number of DE genes.

deseq2 combat batch-effects • 65 views
modified 12 days ago by Michael Love25k • written 13 days ago by Samuel0
Answer: Modelling a multiplicative batch effect with DESeq2
12 days ago by
Michael Love25k
United States
Michael Love25k wrote:

"The process by which DESeq2 models a batch effect is close to subtracting the arithmetic mean of the batches' expression values from the genes, on a per-gene basis."

The way that DESeq2 models batch effects is exactly the same as how it models differences due to condition.

See the third paragraph of the Results section of the DESeq2 paper:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8#Sec2

or the third line of the equation block here:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#the-deseq2-model

We assume that the log2 of the expected mean of the Negative Binomial is explained by a linear combination of the covariates. So there will be a beta associated with the batch differences and a beta associated with the condition differences, in a model ~batch + condition.

So, given that we are modeling the log of the mean, we do have a multiplicative model for batch (or condition, or whatever covariate goes in the design).
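To make this concrete, here is a short worked derivation. It is only a sketch under stated assumptions: the symbols K_ij, mu_ij, s_j and q_ij follow the model block in the DESeq2 vignette linked above, while the two-level indicator variables b_j (batch) and c_j (condition) and the coefficient names are illustrative notation introduced here, assuming batch and condition each have exactly two levels.

```latex
% Gene i, sample j. Assumed two-level design ~ batch + condition,
% with b_j, c_j \in \{0, 1\} marking the non-reference level (illustrative notation).
\begin{align*}
K_{ij} &\sim \mathrm{NB}(\mu_{ij}, \alpha_i), \qquad \mu_{ij} = s_j \, q_{ij} \\
\log_2 q_{ij} &= \beta_{i}^{0} + \beta_{i}^{\mathrm{batch}} \, b_j + \beta_{i}^{\mathrm{cond}} \, c_j \\
q_{ij} &= 2^{\beta_{i}^{0}} \cdot \left(2^{\beta_{i}^{\mathrm{batch}}}\right)^{b_j} \cdot \left(2^{\beta_{i}^{\mathrm{cond}}}\right)^{c_j}
\end{align*}
```

On the log2 scale the batch term is added, but on the scale of the expected counts it acts as a per-gene multiplicative factor of 2^(beta_i^batch). For example, a batch coefficient of 1 for a gene corresponds to a doubling of its expected normalized count in the second batch, whatever the condition.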

Answer: Modelling a multiplicative batch effect with DESeq2
12 days ago by
University Hospital Jena, Germany
Nicolas Huber0 wrote:

I'm not an expert at all, so these are just my thoughts. If you don't know whether the effect is additive, you would write the model matrix with an interaction term: model.matrix(~ diet + sex + diet:sex) or, equivalently, model.matrix(~ diet*sex) (taken from http://genomicsclass.github.io/book/pages/expressingdesignformula.html).

In DESeq2 the standard workflow doesn't have this kind of input, but edgeR does in the glmQLFit function.