Question

EdgeR. Analysis with pairing across condition and treatment. Design Matrix


chwhite • 0

@chwhite-7316


United States

I have the following experimental design and wanted to check the construction of the design matrix as well as a few other questions about setting it up.

# 4 patients x 2 conditions x 3 treatments = 24 samples
Patient <- rep(rep(c("P1","P2","P3","P4"), each=3), times=2)
Condition <- rep(c("uninf","inf"), each=12)
Treatment <- rep(c("D","S","R"), times=8)
Exp_Design <- data.frame(Patient, Condition, Treatment)

From the edgeR user guide, an example that had pairing across treatment but not across Condition (Section 3.5) used the following design matrix.

design<-model.matrix(~Condition+Condition:Patient+Condition:Treatment)

In my case, though, I have pairing across both treatment and condition, so would the following be the proper design matrix?

design2<-model.matrix(~Condition+Patient+Condition:Treatment)

My second question: the intercept causes confusion in my lab whenever I try to explain it to someone.
I tried to remove the intercept by using the following.

design3<-model.matrix(~0+Condition+Patient+Condition:Treatment)

This creates a matrix with columns for both conditions, and I can create a contrast to get the difference between conditions, but it is missing one patient and one Condition:Treatment column.
I don't believe I have removed the intercept parameterization properly. How would I go about setting that up?

Final question. With this experimental setup, would I be better served by running several paired comparisons
where I separate the data based upon which condition (inf,uninf) I'm looking at?
Example.

keep<-which(Condition%in%"inf")
Pat2<-Patient[keep]
Cond2<-Condition[keep]
Treat2<-Treatment[keep]

Exp_design_inf<-cbind(Pat2,Cond2,Treat2)
design_inf<-model.matrix(~Pat2+Treat2)

And then repeat that procedure with the uninf condition.
If this is the route to take, how would I create the design matrix with pairing but without parameterizing using the intercept?

edger ANOVA Design matrix • 3.7k views

ADD COMMENT • link updated 9.2 years ago by Gordon Smyth 50k • written 9.2 years ago by chwhite • 0


The easiest way to define the design matrix depends partly on what comparisons you want to make. What comparisons between Conditions and Treatments do you want to make?

ADD REPLY • link 9.2 years ago Gordon Smyth 50k


Thanks for the information.

I am interested in comparing the effect of the treatment (D vs. S, D vs. R, S vs. R) in both inf and uninf conditions. I am also interested in the differences between the inf and uninf conditions under the effect of all 3 treatments.

The reason it is a paired design across condition is that each patient has a sample collected, then half of that sample is infected with a virus and the other half is left alone. After that, each sample is treated with either the solvent, D, or one of two drugs of interest.

ADD REPLY • link 9.2 years ago chwhite • 0

score 2 · Accepted Answer · 2015-02-01


Gordon Smyth 50k

@gordon-smyth


WEHI, Melbourne, Australia

Your design is a blocked design (a generalization of a paired design) and is covered by Section 3.4.2 in the edgeR User's Guide.

Thanks for clarifying the comparisons you want to make. Block designs are easy to analyse. You simply add +Patient to the model formula, like this:

design <- model.matrix(~ XXXX + Patient)

where XXXX is whatever you would have done without the blocking variable.

In your case, you have two treatment factors with all combinations of the factor levels appearing in your experiment. So you can follow Section 3.3.1 of the edgeR User's Guide to set up a combined factor for Treatment and Condition, and then just add +Patient. Pretty simple when you get the idea!
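As a minimal sketch of the approach described above, the following base R code rebuilds the sample annotation from the question, forms a combined Condition/Treatment factor, and adds Patient as the blocking factor. The names `Group` and `make_contrast` are illustrative assumptions, not taken from the thread or from edgeR, and the contrasts are written as plain numeric vectors so that the example runs without edgeR or limma installed.

```r
# Rebuild the sample annotation from the question
# (4 patients x 2 conditions x 3 treatments = 24 samples).
Patient   <- factor(rep(rep(c("P1", "P2", "P3", "P4"), each = 3), times = 2))
Condition <- rep(c("uninf", "inf"), each = 12)
Treatment <- rep(c("D", "S", "R"), times = 8)

# Combined factor with one level per Condition/Treatment combination.
# "Group" is an illustrative name, not something defined in the thread.
Group <- factor(paste(Condition, Treatment, sep = "."))

# One coefficient per group (no intercept), plus Patient as the block.
design <- model.matrix(~ 0 + Group + Patient)
colnames(design) <- sub("^Group", "", colnames(design))

# Hypothetical helper: a contrast vector with +1 and -1 on two named columns.
make_contrast <- function(plus, minus) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[plus]  <- 1
  v[minus] <- -1
  v
}

# Treatment effect within one condition: S vs. D in infected samples.
S_vs_D_inf <- make_contrast("inf.S", "inf.D")

# Condition effect under one treatment: inf vs. uninf for treatment D.
inf_vs_uninf_D <- make_contrast("inf.D", "uninf.D")

print(colnames(design))
print(S_vs_D_inf)
```

Vectors of this shape are what the `contrast` argument of edgeR's `glmLRT` or `glmQLFTest` accepts; `limma::makeContrasts` is the more usual way to build them by name. The remaining comparisons (D vs. R, S vs. R, and inf vs. uninf under S and R) follow the same pattern.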

ADD COMMENT • link 9.2 years ago Gordon Smyth 50k


Thanks, and answered above. I had read the blocking effect but was unsure how to insert the Condition information into the design. If I separated the two conditions, then I would use the following.

design <- model.matrix(~Patient+Treatment)

I was also confused on how to set this up as in section 3.3.1 without the intercept as that was conceptually easier to understand and explain to others.

ADD REPLY • link 9.2 years ago chwhite • 0


If you want to remove the intercept, you have to use ~0+Treatment+Patient and not ~0+Patient+Treatment. In other words, follow the advice I gave you exactly.

There isn't any reason to separate the two conditions. It is better to analyze all the data together.
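To illustrate why the order of terms matters when the intercept is removed, here is a small base R sketch. It assumes the plain three-level `Treatment` factor from the question rather than a combined factor. With `~0+...`, `model.matrix` gives a full set of indicator columns only to the first factor in the formula.

```r
# Sample annotation from the question, reduced to the two factors of interest.
Patient   <- factor(rep(rep(c("P1", "P2", "P3", "P4"), each = 3), times = 2))
Treatment <- factor(rep(c("D", "S", "R"), times = 8))

# Treatment first: every treatment level gets its own column,
# and Patient is coded relative to the first patient.
treatment_first <- model.matrix(~ 0 + Treatment + Patient)

# Patient first: every patient gets its own column,
# and Treatment loses its first level instead.
patient_first <- model.matrix(~ 0 + Patient + Treatment)

print(colnames(treatment_first))
# "TreatmentD" "TreatmentR" "TreatmentS" "PatientP2" "PatientP3" "PatientP4"
print(colnames(patient_first))
# "PatientP1" "PatientP2" "PatientP3" "PatientP4" "TreatmentR" "TreatmentS"
```

Both matrices describe the same fitted model, but only the first has one coefficient per treatment, which is what makes contrasts between treatments straightforward to write.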

ADD REPLY • link 9.2 years ago Gordon Smyth 50k


Will do. One last thing. I noticed that when I use ~0+Treatment+Patient, the design contains all of the Treatments but is missing the first patient. Since I'm not using the subtraction design, do I need all Patients in it? A workaround I played with was to add a 0 level when generating the Patient factor.

Patient<-factor(Experimental_Design$Patient,levels=c(0,1,2,3,4))

When using this in the design, all Patients show up. Is this the correct approach, or should I let model.matrix remove the first patient?

Edit. When trying this it wouldn't actually estimate the dispersions as the coefficient for patient 4 was not estimable. So, I guess leave the levels as 1, 2, 3, and 4.

ADD REPLY • link 9.2 years ago chwhite • 0


One cannot include a coefficient for all the patients -- it would produce an over-parametrized model. There may be four patients, but there are only three differences between the patients, so there can only be three coefficients in the linear model. This isn't something you need to worry about anyway because you should not be testing contrasts between the patients. Just let the software do the right thing for you.
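The over-parametrization can be checked numerically. The sketch below, in base R only, forces an extra indicator column for the first patient into the design and compares the matrix rank with the number of columns. The column name `PatientP1` and the object name `over` are illustrative assumptions; the effect is the same as the workaround of adding an unused factor level.

```r
# Sample annotation from the question, reduced to the two factors of interest.
Patient   <- factor(rep(rep(c("P1", "P2", "P3", "P4"), each = 3), times = 2))
Treatment <- factor(rep(c("D", "S", "R"), times = 8))

# Standard design: all treatments, patients coded relative to P1.
design <- model.matrix(~ 0 + Treatment + Patient)

# Force in an indicator column for the first patient as well.
over <- cbind(design, PatientP1 = as.numeric(Patient == "P1"))

# Full rank: the rank equals the number of columns, so every
# coefficient is estimable.
print(c(rank = qr(design)$rank, columns = ncol(design)))

# Rank deficient: the four patient columns sum to a column of ones, and so
# do the three treatment columns, so one column is redundant.
print(c(rank = qr(over)$rank, columns = ncol(over)))
```

The rank stays at 6 while the column count rises to 7, which is why one patient coefficient was reported as not estimable in the attempt described above.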