Modelling a multiplicative batch effect with DESeq2
Sam ▴ 10
@sam-21502
Jerusalem

As far as I understand, the way DESeq2 models a batch effect is close to subtracting each batch's arithmetic mean expression from the genes' values, on a per-gene basis.

Is it possible in DESeq2 to model a multiplicative batch effect as well, rather than only an additive one? For example, by dividing each gene's expression levels by the batch-specific geometric mean?

This is the context for my question: I visualized batch effect removal with ComBat, and on a PCA of the corrected values the different conditions separated very nicely. I then ran differential expression (I did not pass the ComBat values into DESeq2, but modelled the batch effect with the design formula "~ batch + condition"). Despite the separation in the PCA, very few genes passed the FDR threshold (about 30). I suspect the reason is that ComBat estimates both additive and multiplicative batch effects, while DESeq2 models only additive ones. Judging by the low number of DE genes, I suspect that multiplicative batch effects exist in my data.

P.S. Each of the compared conditions has only two samples, which might be an alternative explanation for the low number of DE genes.

deseq2 batch-effects ComBat • 349 views
@mikelove
United States

"The process by which DESeq2 models a batch effect is close to: subtracting the arithmetic mean of the batches' expression values from the genes, on a per-gene basis."

The way that DESeq2 models batch effects is exactly the same as how it models differences due to condition.

See the third paragraph of the Results section of the DESeq2 paper:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8#Sec2

or the third line of the equation block here:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#the-deseq2-model

We assume that the log2 of the expected mean of the Negative Binomial is explained by a linear combination of the covariates. So there will be a beta associated with the batch differences and a beta associated with the condition differences, in a model ~batch + condition.
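Written out for the design ~ batch + condition, the model described here looks roughly as follows. The notation follows the DESeq2 vignette (count K_ij for gene i in sample j, size factor s_j, dispersion alpha_i). The indicator names x_j^batch and x_j^cond are shorthand introduced for this sketch, and it assumes a two-level batch and a two-level condition.

```latex
\begin{align*}
K_{ij} &\sim \mathrm{NB}(\mu_{ij}, \alpha_i) \\
\mu_{ij} &= s_j \, q_{ij} \\
\log_2 q_{ij} &= \beta_{i0}
  + \beta_i^{\text{batch}} \, x_j^{\text{batch}}
  + \beta_i^{\text{cond}} \, x_j^{\text{cond}}
\end{align*}
```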

So, given that we are modeling the log of the mean, we do have a multiplicative model for batch (or condition, or whatever covariate goes in the design).
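To see numerically why a term added on the log2 scale acts as a multiplier on the mean, here is a small NumPy sketch. It is not DESeq2 code: the coefficient values are made up for illustration, and the 0/1 coding of batch and condition is an assumption of the sketch.

```python
import numpy as np

# Hypothetical log2-scale coefficients for a single gene (illustrative values only).
beta_intercept = 5.0   # log2 mean for batch A, control
beta_batch = 1.0       # log2 shift for batch B
beta_condition = 2.0   # log2 shift for treated

# One row per batch/condition cell: (batch indicator, condition indicator).
cells = np.array([[0, 0],   # batch A, control
                  [1, 0],   # batch B, control
                  [0, 1],   # batch A, treated
                  [1, 1]])  # batch B, treated

# Linear predictor on the log2 scale, then back to the scale of the mean.
log2_mean = beta_intercept + cells[:, 0] * beta_batch + cells[:, 1] * beta_condition
mean = 2.0 ** log2_mean

# Batch B relative to batch A, within each condition.
batch_ratio_control = mean[1] / mean[0]
batch_ratio_treated = mean[3] / mean[2]
batch_diff_control = mean[1] - mean[0]
batch_diff_treated = mean[3] - mean[2]

print("ratio B/A:", batch_ratio_control, batch_ratio_treated)
print("difference B-A:", batch_diff_control, batch_diff_treated)
```

In this sketch the batch ratio is the same in both conditions (2 to the power of beta_batch), while the batch difference on the raw scale is not, which is what a multiplicative effect means.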

Solarion • 0
@solarion-22030
University Hospital Jena, Germany

I'm not an expert at all, so these are just my thoughts: with model.matrix, if you don't know whether the effect is additive, you would write model.matrix(~ diet + sex + diet:sex) or, equivalently, model.matrix(~ diet*sex) (taken from http://genomicsclass.github.io/book/pages/expressingdesignformula.html).
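As a rough illustration of what those two formulas expand to, the sketch below builds the corresponding design matrices by hand in NumPy rather than with R's model.matrix. The 2x2 layout with one sample per cell and the 0/1 level coding are assumptions made for the example.

```python
import numpy as np

# Hypothetical 2x2 layout, one sample per cell, levels coded as 0/1.
diet = np.array([0, 0, 1, 1])
sex = np.array([0, 1, 0, 1])
intercept = np.ones_like(diet)

# ~ diet + sex: intercept plus the two main effects.
main_effects = np.column_stack([intercept, diet, sex])

# ~ diet + sex + diet:sex (the same as ~ diet*sex):
# the interaction column is the elementwise product of the two indicators.
with_interaction = np.column_stack([intercept, diet, sex, diet * sex])

print(main_effects)
print(with_interaction)
```

The interaction column is 1 only for samples that are at the non-reference level of both factors, so it lets the effect of one factor differ depending on the level of the other.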

In DESeq2 the standard workflow doesn't have this kind of input, but edgeR does in the glmQLFit function.
