Question

Clarification of batch effect adjustment in DE analysis

IRAIA.MAIALEN • 0

@iraiamaialen-19272

Last seen 6.9 years ago

Hi everyone! I'm having trouble understanding the edgeR protocol in which a batch effect is adjusted for in a differential expression analysis (section 4.2.8).

I also have a batch effect in my samples, and I have constructed the design matrix like this:

> design
             conditionControl conditionMorphine batch2
C_P1_54                1                 0      0
C_P1_55                1                 0      1
M_P1_60                0                 1      0
M_P1_61                0                 1      1
attr(,"assign")
[1] 1 1 2
attr(,"contrasts")
attr(,"contrasts")$condition
[1] "contr.treatment"

attr(,"contrasts")$batch
[1] "contr.treatment"

Is this correct?

Moreover, I have followed the example in the edgeR protocol from the top to the final step, but I don't understand how the batch effect is taken into account in the final contrast. It says:

"First we check whether there was a genuine need to adjust for the experimental times. We do this by testing for differential expression between the three times. There is considerable differential expression, justifying our decision to adjust for the batch effect:"

And then:

"Now conduct QL F-tests for the pathogen effect and show the top genes. By default, the test is for the last coefficient in the design matrix, which in this case is the treatment effect:"

So am I correct in thinking that in the second test the batch effect is taken into account even though we don't add it to the command? Does including it throughout the analysis mean that the values are already adjusted? Or am I missing something important?

Thanks in advance

edger rnaseq batch effect • 1.4k views

updated 6.9 years ago by James W. MacDonald 68k • written 6.9 years ago by IRAIA.MAIALEN • 0

score 1 · Answer 1 · 2019-01-11

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 days ago

United States

Yes, it's correct, assuming that you have two batches and one sample of each condition in each batch.
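For reference, a design matrix like the one in the question can be built in base R roughly as follows. This is only a sketch: the `samples` data frame and the formula `~ 0 + condition + batch` are assumptions inferred from the column names and the `assign` attribute shown in the question, since the original code was not posted.

```r
# Hypothetical sample table matching the layout in the question:
# two conditions, two batches, one sample per condition in each batch.
samples <- data.frame(
  condition = factor(c("Control", "Control", "Morphine", "Morphine")),
  batch     = factor(c(1, 2, 1, 2)),
  row.names = c("C_P1_54", "C_P1_55", "M_P1_60", "M_P1_61")
)

# No intercept: one column per condition, plus one column for batch 2
# (batch 1 is the reference level under the default treatment contrasts).
design <- model.matrix(~ 0 + condition + batch, data = samples)
print(design)

# Check that every condition appears in every batch.
print(table(samples$condition, samples$batch))
```

If any cell of the condition-by-batch table were zero, batch and condition would be partly confounded and the design should be reconsidered.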

The batch effect isn't taken into account in the final contrast. It was taken into account when you fit the model: because the batch column is in the design matrix, the condition coefficients are already estimated with batch adjusted for. You can check whether the batch effect is necessary in two ways. First, you could make an MDS or PCA plot of your samples and see if there appears to be a batch effect. Second, you could test for significance of your third coefficient (the batch effect), and if you get lots (for some definition of 'lots') of genes that are significant, then you can say it is probably necessary.
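A minimal edgeR sketch of those steps might look like this. The count matrix is simulated purely for illustration, and the object names (`counts`, `qlf_batch`, `qlf_cond`) are made up; with only four samples and three coefficients there is a single residual degree of freedom, so real results from such a design should be read with caution.

```r
library(edgeR)

set.seed(1)
# Simulated counts, for illustration only: 200 genes x 4 samples.
counts <- matrix(
  rnbinom(200 * 4, mu = 50, size = 10), ncol = 4,
  dimnames = list(paste0("gene", 1:200),
                  c("C_P1_54", "C_P1_55", "M_P1_60", "M_P1_61"))
)
condition <- factor(c("Control", "Control", "Morphine", "Morphine"))
batch <- factor(c(1, 2, 1, 2))
design <- model.matrix(~ 0 + condition + batch)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Visual check: do samples separate by batch?
plotMDS(y)

y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)  # batch is adjusted for here, in the model fit

# Check 2: test the batch coefficient (third column of the design).
qlf_batch <- glmQLFTest(fit, coef = 3)
print(topTags(qlf_batch))

# Final test: Morphine vs Control. Batch is not named in the contrast,
# but both condition coefficients were estimated with batch in the model.
qlf_cond <- glmQLFTest(fit, contrast = c(-1, 1, 0))
print(topTags(qlf_cond))
```

The contrast vector `c(-1, 1, 0)` follows the column order of the design matrix (conditionControl, conditionMorphine, batch2), so the zero in the last position simply means the batch coefficient is not part of the comparison being tested.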

6.9 years ago • James W. MacDonald 68k

Thank you!!!!!! Very nice explanation!! I saw that the protocol does what you are describing to check the significance, but I didn't know what it was for, and I got confused by the final results. So: fit the model with the design matrix that includes the batch effect, and then do the final test between control and treatment. Thanks!!

6.9 years ago • IRAIA.MAIALEN • 0