Question

Removing batch effect in edgeR

kaihami ▴ 30

I have a naive question,

Even though I've seen a lot of questions about this issue (batch effects) in RNA-Seq analysis, I still don't understand exactly what edgeR is doing during this procedure.

For example, I have only 2 different conditions (let's call them WT and Mutant).

In this case I can:

group <- factor(c(rep("WT", 3), rep("Mutant", 3)),
                levels = c("WT", "Mutant"))

and create a batch factor:

batch <- factor(c("1", "1", "2",
"1", "1", "2"))

Then:

design <- model.matrix(~group+batch, data = y$samples )

or:

design <- model.matrix(~batch+group, data = y$samples)

design

      (Intercept) groupMutant batch2
WT-1            1           0      0
WT-2            1           0      0
WT-3            1           0      1
Mut-1           1           1      0
Mut-2           1           1      0
Mut-3           1           1      1

How do I perform downstream analysis?

Specifically in lrt:

lrt <- glmLRT(fit,coef =?)

Which coefficient should I use? If I understand correctly, I should use coef = 2, right?

But if I use only coef = 2, will it take the batch effect into account? Or should I use coef = 2:3? Or even glmLRT(fit)?

What is going on under the hood?

Thank you very much,

Regards

updated 7.2 years ago by Aaron Lun ★ 28k • written 7.2 years ago by kaihami ▴ 30

score 1 · Answer 1 · 2017-02-04

Yes, you should use coef=2 in glmLRT if you want to study the mutant effect. Yes, the batch effect will be accounted for, as the third coefficient (batch2) will absorb any systematic difference between the batch 2 samples (WT-3 and Mut-3) and the other samples. This is done automatically; you don't have to specify it in glmLRT.
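For concreteness, here is a minimal sketch of that workflow. It assumes simulated counts (the matrix, gene names and seed are made up for illustration, not taken from the poster's data) and the ~group+batch design from the question. Note that calling glmLRT(fit) without coef tests the last column of the design, which here is batch2 rather than the mutant effect, while coef = 2:3 would test the group and batch coefficients jointly.

```r
library(edgeR)

set.seed(1)  # simulated counts only, so the example is reproducible

# Hypothetical data: 200 genes x 6 samples of negative binomial counts.
samples <- c("WT-1", "WT-2", "WT-3", "Mut-1", "Mut-2", "Mut-3")
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), samples))

# WT first in `levels` makes it the reference, giving a groupMutant column.
group <- factor(rep(c("WT", "Mutant"), each = 3), levels = c("WT", "Mutant"))
batch <- factor(c("1", "1", "2", "1", "1", "2"))
design <- model.matrix(~ group + batch)

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)  # dispersions estimated with batch in the model
fit <- glmFit(y, design)

# Test the Mutant vs WT coefficient; batch2 stays in both the full and null fits.
lrt_group <- glmLRT(fit, coef = "groupMutant")  # same as coef = 2 for this design
print(topTags(lrt_group, n = 5))

# For comparison only: the default coef is the last design column, i.e. batch2.
lrt_default <- glmLRT(fit)
```

Referring to the coefficient by name rather than by position is a design choice that keeps the call correct if the formula is later written as ~batch+group, where groupMutant becomes column 3.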

The support site isn't the place to explain the nitty-gritty of how edgeR works. If you want to know more, I suggest that you first try to understand linear models (Google is your friend here), then move on to generalized linear models, and then read the various edgeR publications (see the user's guide for specific citations).
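As a starting point for the generalized linear model part, the idea behind a likelihood ratio test on one coefficient can be sketched in base R for a single gene. This is a deliberate simplification and not what edgeR runs internally: it uses a Poisson GLM on simulated counts (the fold changes and seed are made up), whereas edgeR fits negative binomial GLMs with dispersions estimated across genes. The logic of comparing a full model against one with the group term dropped is the same.

```r
set.seed(42)  # simulated data, for illustration only

group <- factor(rep(c("WT", "Mutant"), each = 3), levels = c("WT", "Mutant"))
batch <- factor(c("1", "1", "2", "1", "1", "2"))

# Hypothetical gene: 2-fold higher in Mutant, 3-fold higher in batch 2.
mu <- 50 * ifelse(group == "Mutant", 2, 1) * ifelse(batch == "2", 3, 1)
counts <- rpois(6, mu)

full <- glm(counts ~ group + batch, family = poisson)  # all three coefficients
null <- glm(counts ~ batch, family = poisson)          # groupMutant dropped

# Likelihood ratio statistic: drop in deviance when the group term is added.
lr <- deviance(null) - deviance(full)
p_value <- pchisq(lr, df = 1, lower.tail = FALSE)

print(coef(full))  # log-scale estimates; groupMutant is adjusted for batch
print(c(LR = lr, p = p_value))
```

Because batch appears in both fits, the test only asks whether the group term explains variation beyond what batch already accounts for.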