Limma design: paired samples, batch effect
1
0
Entering edit mode
John Burger ▴ 10
@john-burger-3380
Last seen 11.1 years ago

Hello,

This is yet again a question about limma design matrices. I would like to perform analysis on single channel arrays (Affy). The experimental design is: paired samples before and after treatment, and I would also like to take into account batch effect.

I am having problems with the batch effect, by reading through mails in this mailing list, I've gotten the impression that you can include batch to your design matrix, but when I try to do that, I get an error message. Here is a simulated example of what I am trying to do:

Create example date, patient (5 patients, 2 samples), treatment (A or B), and batch (three batches, 1-3)

> patient <- factor(rep(1:5,2))
> patient
 [1] 1 2 3 4 5 1 2 3 4 5
Levels: 1 2 3 4 5
> treatment <- factor(c(rep("A",5), rep("B",5)))
> treatment
 [1] A A A A A B B B B B
Levels: A B
> batch <- factor(c(rep(1,3), rep(2,3), rep(3,4)))
> batch
 [1] 1 1 1 2 2 2 3 3 3 3
Levels: 1 2 3

If I would like to perform a basic analysis between the paired samples and not taking into account batch effect, I could just do:

> design <- model.matrix(~patient+treatment)
> design
   (Intercept) patient2 patient3 patient4 patient5 treatmentB
1            1        0        0        0        0          0
2            1        1        0        0        0          0
3            1        0        1        0        0          0
4            1        0        0        1        0          0
5            1        0        0        0        1          0
6            1        0        0        0        0          1
7            1        1        0        0        0          1
8            1        0        1        0        0          1
9            1        0        0        1        0          1
10           1        0        0        0        1          1

And then use treatmentB to obtain the differences between treatments A and B.

But, how could I include batch effect to the model? My inital thought was:

> design <- model.matrix(~patient+treatment+batch)
> design
   (Intercept) patient2 patient3 patient4 patient5 treatmentB batch2 batch3
1            1        0        0        0        0          0      0      0
2            1        1        0        0        0          0      0      0
3            1        0        1        0        0          0      0      0
4            1        0        0        1        0          0      1      0
5            1        0        0        0        1          0      1      0
6            1        0        0        0        0          1      1      0
7            1        1        0        0        0          1      0      1
8            1        0        1        0        0          1      0      1
9            1        0        0        1        0          1      0      1
10           1        0        0        0        1          1      0     1

But when I do that with actual data, I get the following error message:

Coefficients not estimable: batch2 batch3
Warning message:
Partial NA coefficients for 10163 probe(s)

Any ideas what I am doing wrong, and more importantly, what I should do to take that batch effect into account?

Thanks,
John

limma • 2.6k views
ADD COMMENT
2
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

Dear John,

You can't expect to be able to estimate more things than your experiment contains information about.  In this case, your batches are entirely confounded with patients+treatment, and hence you can't estimate coefficients for them. That is what the message from limma is telling you.

BTW, you got a diagnostic message and a warning, not an "error" message.

Best wishes
Gordon

ADD COMMENT

Login before adding your answer.

Traffic: 1017 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6