Question: Accounting for batch effect in DESeq2
gtechbio0 wrote 7 months ago:

Dear all,

I am having a problem with removing/accounting for the batch effect in my RNA-seq experiment using DESeq2.

Initially, we ran one large time-course experiment in a single batch of human cells grown with a fungus.

The colData is the following:

time
1     0
2     0
3   1.5
4   1.5
5   1.5
6     3
7     3
8     3
9    12
10   12
11   12
12   24
13   24
14   24
15   24
16  24c
17  24c


Then I was comparing all time points against 0 using contrasts.

After getting some initial results, we decided to perform another experiment at 3h with a different fungus (let's call it 3D) to compare the 0 vs 3 difference with the 3 vs 3D difference.

After analyzing the data (not only what is mentioned above, but also a similar experimental design with 3 other fungal species), I realized that the 3D samples have a strong batch effect (I saw this by removing the batch effect using limma).

For the case above, when I add the batch variable to the design, I get the error that the model matrix is not full rank, which of course makes sense.
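
To make the rank problem concrete, here is a minimal sketch in plain Python (not DESeq2 code). It builds a treatment-coded design from the 17 samples in the colData above plus the second-batch 3D samples, then checks the rank with and without a batch column. The number of 3D replicates (3 here) and the helper names `matrix_rank` and `design` are assumptions made for illustration only.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Conditions from the colData above, plus 3 hypothetical 3D samples (assumed count).
conditions = (["0"] * 2 + ["1.5"] * 3 + ["3"] * 3 + ["12"] * 3
              + ["24"] * 4 + ["24c"] * 2 + ["3D"] * 3)
batches = [1] * 17 + [2] * 3  # only the 3D samples are in batch 2
levels = ["1.5", "3", "12", "24", "24c", "3D"]  # "0" is the reference level


def design(with_batch):
    """Intercept + one indicator per non-reference condition (+ batch indicator)."""
    rows = []
    for cond, batch in zip(conditions, batches):
        row = [1] + [int(cond == lev) for lev in levels]
        if with_batch:
            row.append(int(batch == 2))
        rows.append(row)
    return rows


no_batch = design(with_batch=False)
with_batch = design(with_batch=True)
print(len(no_batch[0]), matrix_rank(no_batch))      # 7 columns, rank 7
print(len(with_batch[0]), matrix_rank(with_batch))  # 8 columns, rank 7
```

The batch column is identical to the 3D column, so adding it contributes a column but no rank, which is what the "not full rank" error reports.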

I am wondering if there is any workaround for this problem. For example, can I somehow use the output of limma's removeBatchEffect to construct a new model and run the DE analysis? Thank you

modified 7 months ago by Michael Love25k • written 7 months ago by gtechbio0
Answer: Accounting for batch effect in DESeq2
Michael Love (United States) wrote 7 months ago:

It sounds like you have a confounded design. You should have included some 0's in the new batch if you wanted to learn about the 3 vs 0 difference, but as it is, you cannot make much use of these extra samples. You don't know which is closer to the true level of expression, the mean from 3 or the mean from 3D. All you know is that they are different from each other, separated by a batch effect.
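
A small numeric sketch of why the confounding cannot be undone from the data alone. The expression values are made up, and `fitted_3D` is a hypothetical helper. Every split of the observed 3 vs 3D gap into "treatment" and "batch" reproduces the same observed mean, so the data cannot prefer one split over another.

```python
# Made-up log2 mean expression for a single gene.
mean_3 = 5.0   # batch 1, original fungus at 3h
mean_3D = 7.0  # batch 2, different fungus at 3h


def fitted_3D(treatment_effect, batch_effect):
    """Model mean for 3D: the mean of 3 plus a treatment shift plus a batch shift."""
    return mean_3 + treatment_effect + batch_effect


# Three very different explanations of the same observed gap of 2.0.
candidates = [(2.0, 0.0), (0.0, 2.0), (-1.0, 3.0)]
for treatment_effect, batch_effect in candidates:
    print(treatment_effect, batch_effect, fitted_3D(treatment_effect, batch_effect))
```

All three candidates fit the 3D mean exactly. Only samples that share a condition across both batches would let the model tell them apart.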

Hi Michael, thanks for the reply. If I knew in advance which genes are not differentially expressed in 3 and 3D compared to 0 without the batch effect (basically genes that are unaffected by both types of treatment), could I use these genes to somehow infer the strength of the batch effect, given that with the batch effect these genes change expression?

Thank you
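
The idea in this comment could be sketched roughly as follows. This is a deliberately simplified illustration, not a method from DESeq2 or any other package. The gene names and values are made up, and it rests on a strong assumption: that the batch effect is one shift on the log scale shared by all genes.

```python
from statistics import median

# Made-up log2 means per gene: (mean in 3, batch 1; mean in 3D, batch 2).
# Control genes are assumed to be unaffected by either treatment.
control_genes = {
    "ctrlA": (6.0, 7.5),
    "ctrlB": (3.0, 4.5),
    "ctrlC": (8.0, 9.0),
}
gene_of_interest = (5.0, 7.0)

# Any 3 vs 3D difference in a control gene is attributed to batch.
batch_shift = median(m3d - m3 for m3, m3d in control_genes.values())

raw_gap = gene_of_interest[1] - gene_of_interest[0]
adjusted_gap = raw_gap - batch_shift  # what is left after removing the assumed batch shift
print(batch_shift, raw_gap, adjusted_gap)
```

The result is only as good as the control genes and the shared-shift assumption. If the batch effect differs from gene to gene, a single shift like this will not remove it.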


I don't really follow, but generally DESeq2 won't make much use of these samples, especially as you can't tell from the data what's batch and what's treatment effect.

Thanks for the swift reply, Michael! Sorry for not explaining properly, but it seems that what I meant is implemented in RUVg from the RUVSeq package http://bioconductor.org/packages/release/bioc/vignettes/RUVSeq/inst/doc/RUVSeq.pdf (chapter 2.2), which can be coupled with DESeq2. I am wondering if you have any experience with this package? Maybe I should open a separate question for this. Thanks again for the support.


Yeah, feel free to post an RUVg question (their answer will be applicable broadly to all the RNA-seq DE packages). But it's outside the scope of DESeq2 methods.