Question

edgeR design matrix

1

Entering edit mode

dongamit123 ▴ 20

@dongamit123-8091

Last seen 9.4 years ago

United States

Hi,

I have 5 individuals with a paired design - before treatment and after treatment. The RNA-seq was done in two batches, so I need to also correct for the batch effect. The data looks like this:

Subject	Batch	Treatment
1	2	0
2	1	0
3	2	0
4	1	0
5	2	0
1	2	1
2	1	1
3	2	1
4	1	1
5	2	1

I need to test the effect of treatment, adjusting for differences between the individuals and the batch effect. I am struggling with the design matrix. I had tried: design <- model.matrix(~grp+patient+batch). However, I am getting the error "Design matrix not of full rank. " . I would be really grateful if someone could help me with the design matrix.

Thanks,

AD

edger multiple factor design • 4.4k views

ADD COMMENT • link 9.5 years ago dongamit123 ▴ 20

score 2 · Answer 1 · 2015-06-05

2

Entering edit mode

James W. MacDonald 67k

@james-w-macdonald-5106

Last seen 1 day ago

United States

You don't have to account for batch (nor can you), as it is confounded with subject. Just fit the model

design <- model.matrix(~grp+patient)

ADD COMMENT • link 9.5 years ago James W. MacDonald 67k

score 0 · Answer 2 · 2015-06-05

0

Entering edit mode

dongamit123 ▴ 20

@dongamit123-8091

Last seen 9.4 years ago

United States

Thank you for the quick response. In your opinion what would be a better approach - analyze both batches together or analyze batches 1 and 2 separately and take the common DE genes between both batches?

Thanks again,

AD

ADD COMMENT • link 9.5 years ago dongamit123 ▴ 20

1

Entering edit mode

You want to analyze the data all together. Paired analyses are more powerful, because you are accounting for subject-level variability by virtue of the design.

ADD REPLY • link 9.5 years ago James W. MacDonald 67k

score 0 · Answer 3 · 2015-06-05

0

Entering edit mode

dongamit123 ▴ 20

@dongamit123-8091

Last seen 9.4 years ago

United States

Thank you. That makes sense. Just to make sure I understand the design matrix correctly - if I have the following data (I changed the batch for individual 3).

Subject	Batch	Treatment
1	2	0
2	1	0
3	1	0
4	1	0
5	2	0
1	2	1
2	1	1
3	2	1
4	1	1
5	2	1

Would the following test for the differences between the treatments while correcting for batches and patient-specific effects:

design <- model.matrix(~batch+patient+grp)

fit <- glmFit(y,design)

lrt.RAvsC <- glmLRT(fit)

I appreciate your help.

Thanks, AD

ADD COMMENT • link 9.5 years ago dongamit123 ▴ 20

1

Entering edit mode

Patient 3 will contribute no useful information in this altered design. Consider this; if the "after" sample for patient 3 is sequenced in a different batch from the "before" sample, then the difference between the two samples for this patient will have contributions from both the treatment and batch effects. The latter will confound any attempt to extract information about the former.

Instead, the patient 3 samples will only provide information about the batch effect. The treatment effect can be estimated from the differences between the before and after samples of the other patients. This means that the batch effect can be estimated by subtracting the estimated treatment effect from the before/after difference of patient 3. I don't think that knowing the size of the batch effect is particularly useful, given that you're interested in looking at the effect of the treatment.

It's important to note that the final estimate of the batch effect will have no effect on the estimate of the treatment effect. The former depends on the latter, but not the other way around. Thus, you're not really "correcting" for the batch effect when you estimate the treatment in this altered design. In fact, you could drop the patient 3 samples altogether and get the same treatment estimates.

ADD REPLY • link 9.5 years ago Aaron Lun ★ 28k

0

Entering edit mode

~~You can't put both batch and patient into your design at the same time, since they are confounded with each other. This means that you cannot separate out the patient effects from the batch effects.~~ (Never mind, I didn't realize you had changed the batch variable.)

ADD REPLY • link 9.5 years ago Ryan C. Thompson ★ 7.9k

score 0 · Answer 4 · 2015-06-07

0

Entering edit mode

dongamit123 ▴ 20

@dongamit123-8091

Last seen 9.4 years ago

United States

Thank you for the detailed response. It makes sense now. Thanks!

ADD COMMENT • link 9.5 years ago dongamit123 ▴ 20