Question: edgeR design matrix and contrasts: how to make contrast between groups that aren't shown in the design matrix columns?
gravatar for cafelumiere12
22 months ago by
United States
cafelumiere1220 wrote:

Hi all -

I feel like I might have missed some important points in constructing design matrix for a slightly more complicated experimental design - so I have 6 different animals, each treated with with Dose_A, Dose_B, or Control. The samples were collected at 2hr or 6hr as shown in the table below:

SampleBarcode AnimalNumber AnimalID TimePoint_hr Dose
1001 1 9094 2 Control
1002 2 9093 2 Control
1003 3 9099 2 Control
1004 1 9112 6 Control
1005 2 9078 6 Control
1006 3 9077 6 Control
1007 1 9094 2 Dose_A
1008 2 9093 2 Dose_A
1009 3 9099 2 Dose_A
1010 1 9112 6 Dose_A
1011 2 9078 6 Dose_A
1012 3 9077 6 Dose_A
1013 1 9094 2 Dose_B
1014 2 9093 2 Dose_B
1015 3 9099 2 Dose_B
1016 1 9112 6 Dose_B
1017 2 9078 6 Dose_B
1018 3 9077 6 Dose_B

I'm trying to get the differences in gene expression between the following groups, and I'm especially having trouble with the highlighted ones:

  1. Samples treated with Dose_A vs. Dose_B at 2 hr
  2. Samples treated with Dose_A vs. Dose_B at 6 hr
  3. Samples treated with Dose_B vs Control at 2 hr
  4. Samples treated with Dose_B vs,Control at 6 hr

I tried following Section 3 of the manual, particularly 3.5, and added the above Animal Number to the sample manifest, make design matrix like this:

sampleInfo <- read_csv(<samplemanifest_csvfile>,col_names=TRUE)
Subject <- factor(sampleInfo$AnimalNumber)
Timepoint <- factor(sampleInfo$TimePoint_hr)
Dose <- factor(sampleInfo$Dose)
# Make design matrix following edgeR manual 3.5
design <- model.matrix(~Dose+Dose:Subject+Dose:Timepoint)
colnames(design) <- make.names(colnames(design))

When I look at the design matrix:

 [1] "X.Intercept."           "DoseDose_A"             "DoseDose_B"             "DoseControl.Subject2"  
 [5] "DoseDose_A.Subject2"    "DoseDose_B.Subject2"    "DoseControl.Subject3"   "DoseDose_A.Subject3"   
 [9] "DoseDose_B.Subject3"    "DoseControl.Timepoint6" "DoseDose_A.Timepoint6"  "DoseDose_B.Timepoint6"

So in order to get the comparison 

  1. Samples treated with Dose_A vs. Dose_B at 6 hr
  2. Samples treated with Dose_B vs.Control at 6 hr

Is the following contrast correct?

my.contrasts = makeContrasts(

Furthermore, how do I get the contrasts for:

  1. Samples treated with Dose_A vs. Dose_B at 2 hr
  2. Samples treated with Dose_B vs Control at 2 hr

 Since these two don't even exist in the design matrix?

thanks very much in advance!

ADD COMMENTlink modified 22 months ago by Gordon Smyth39k • written 22 months ago by cafelumiere1220
Answer: edgeR design matrix and contrasts que: how to make contrast between groups that
gravatar for Gordon Smyth
22 months ago by
Gordon Smyth39k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth39k wrote:

You are following Section 3.5 of the edgeR User's Guide, which gives advice on a complex type of design with multiple error levels, when you need to make comparisons between subjects as well as within subjects.

Your experiment is actually simpler than that. Your experiment is a classic blocked design in which all the animals are equivalent and all the treatments are applied to each animal. You only need to make comparisons within animals. So your experiment is actually like Section 3.4.2 of the User's Guide.

Your experiment is slightly more complicated than the example in Section 3.4.2 because you have two treatments (Timepoint and Dose) instead of just one. But you can easily convert it to be the same format:

Treatment <- paste(sampleInfo$Dose, sampleInfo$TimePoint_hr, sep=".")
Treatment <- factor(Treatment)
Animal <- factor(sampleInfo$AnimalID)
design <- model.matrix(~Animal+Treatment)

Then you can easily form any contrasts you like between the Treatment groups.


ADD COMMENTlink modified 22 months ago • written 22 months ago by Gordon Smyth39k

Thank you so much for your help !! 

One quick question: my time points are actually more complicated - instead of just 2 and 6 hr, there are: 2,6,12,24,48,96,168 hr. So when I used the above design matrix, it was the "12hr " not the "2hr" that was hidden as the control part of the design matrix column - i.e. none of the 12hr timepoint can be found in the design matrix. Is the best way to go around this just to make the design matrix this way instead? or is there better way to do it?

design <- model.matrix(~0+Treatment+Animal)

Also, may I ask is there any error in making the design matrix the original complicated way? i.e. is the analysis that I was able to do not valid?  Or is it simply that I can't do all the comparison that I wanted to do?

thanks very much again!


ADD REPLYlink modified 22 months ago • written 22 months ago by cafelumiere1220

For your first question; yes, a model without an intercept is perfectly fine. You just have to adjust your interpretation of the coefficients, as each Treatment* coefficient should now represent the average expression of the corresponding group. Contrasts between groups should then be easy to set up.

For the second question; blocking on Dose:Subject in your old matrix does not seem to be correct, assuming that AnimalID is the unique animal identifier for your six mice. This would suggest that, e.g., samples 1001 and 1004 come from the same animal, whereas the IDs indicate that they originate from two different animals.

Putting that aside, it is possible to formulate Gordon's model (using Treatment) with interaction terms (Dose:Timepoint, etc.) similar to what you have done in your original post. When done correctly, the two models should give exactly the same results. However, I would prefer to use a single Treatment factor as it is much simpler to interpret and understand, and thus less likely for a contrast to be misspecified; see the advice in Section 3.3. of the user's guide for an analogous situation.

ADD REPLYlink written 22 months ago by Aaron Lun25k

Thank you!!

ADD REPLYlink written 22 months ago by cafelumiere1220
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 184 users visited in the last hour