Question: Finding genes respond differently to treatment between many genotype comparison
0
14 months ago by
chiayi0
New York University
chiayi0 wrote:

Hello,

I have an RNA-seq experiment that I would like to analysis using nested design matrix, similar to section 3.5 in the edgeR User's Guide:
> design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment) > colnames(design) [1] "(Intercept)"                      "DiseaseDisease1" [3] "DiseaseDisease2"                  "DiseaseHealthy:Patient2" [5] "DiseaseDisease1:Patient2"         "DiseaseDisease2:Patient2" [7] "DiseaseHealthy:Patient3"          "DiseaseDisease1:Patient3" [9] "DiseaseDisease2:Patient3"         "DiseaseHealthy:TreatmentHormone" [11] "DiseaseDisease1:TreatmentHormone" "DiseaseDisease2:TreatmentHormone"

My goal is to find genes that respond differently to the hormone in ANY disease pairs (disease1 vs healthy, disease2 vs healthy, and disease1 vs disease2). Although I could achieve each comparison using contrasts, as stated on page 39 of the User's Guide,
qlf <- glmQLFTest(fit, contrast=c(0,0,0,0,0,0,0,0,0,-1,1,0))
qlf <- glmQLFTest(fit, contrast=c(0,0,0,0,0,0,0,0,0,-1,0,1))
qlf <- glmQLFTest(fit, contrast=c(0,0,0,0,0,0,0,0,0,0,-1,1))

I was wondering if I could combine the above three into one glmQLFTest using either contrast or coefficient. In my data, I have 18 genotypes comparable to the disease group in the above example and I am interested in finding the genes respond differently to the treatment in ANY genotype comparison. This would result in 153 possible pair-wise comparisons and I'm curious to know a way better than conducting glmQLFTest 153 times.

>mydesign <- model.matrix(~genotype+genotype:treatment) >colnames(mydesign)  [1] "(Intercept)"           "genotype2"             "genotype3"             "genotype4"              [5] "genotype5"             "genotype6"             "genotype7"             "genotype8"              [9] "genotype9"             "genotype10"            "genotype11"            "genotype12"            [13] "genotype13"            "genotype14"            "genotype15"            "genotype16"            [17] "genotype17"            "genotype18"            "genotype1:treatmentH"  "genotype2:treatmentH"  [21] "genotype3:treatmentH"  "genotype4:treatmentH"  "genotype5:treatmentH"  "genotype6:treatmentH"  [25] "genotype7:treatmentH"  "genotype8:treatmentH"  "genotype9:treatmentH"  "genotype10:treatmentH" [29] "genotype11:treatmentH" "genotype12:treatmentH" "genotype13:treatmentH" "genotype14:treatmentH" [33] "genotype15:treatmentH" "genotype16:treatmentH" "genotype17:treatmentH" "genotype18:treatmentH"

Any help would be much appreciated!

modified 14 months ago by Gordon Smyth37k • written 14 months ago by chiayi0
Answer: Finding genes respond differently to treatment between many genotype comparison
3
14 months ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

Just set up an ANODEV. Using the design from Section 3.5 as an illustrative example:

qlf <- glmQLFTest(fit,
contrast=cbind(
c(0,0,0,0,0,0,0,0,0,-1,1,0),
c(0,0,0,0,0,0,0,0,0,-1,0,1)
)
)


In the example code above, DiseaseHealthy:TreatmentHormone serves as a reference, and each column of the contrast matrix compares one of the DiseaseDisease*:TreatmentHormone coefficients to the reference. The null hypothesis of each column is that the hormone effect in the corresponding disease is the same as that in the healthy individuals; the global null hypothesis across the entire contrast matrix is that the hormone effect is the same in the healthy state and in  both diseases. glmQLFTest will then test this global null, which will be rejected if there is DE between any pair of hormone effects.

From this, you can see that it doesn't really matter what you choose as the reference. If you choose A as a reference, the nulls are B == A and C == A, and the global null is A == B == C. You'll get the same global null regardless of what condition you pick to be the reference. Similarly, there's no need to specify all pairwise comparisons, as B == C is implied from the above.

Adapting the above reasoning to mydesign, you can build your contrast matrix as:

con.mat <- matrix(0, 36, 17)
con.mat[19,] <- -1 # first interaction term is the reference
con.mat[cbind(2:18+18, 1:17)] <- 1 # comparing to each other term

... for use in the contrasts= argument of glmQLFit.

ADD COMMENTlink modified 14 months ago • written 14 months ago by Aaron Lun24k

Aaron,

Thank you so much for the detailed clarification. This certainly helped me better understand the contrast and ANODEV. I really appreciate it!

Answer: Finding genes respond differently to treatment between many genotype comparison
2
14 months ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

Aaron has explained that the edgeR will conduct an anova-type test (analysis of deviance in this case) if you give it more than one coefficient or more than one contrast. In your case, you are looking for genes that show a genotype x treatment interaction. The easiest way to do this test for your data is to define

design <- model.matrix(~genotype*treatment)


and then later use

qlf <- glmQLFTest(fit, coef=20:36)

This will find genes that respond differently to the treatment in any genotype.

In this case you don't need contrasts at all, because the interactions are already part of the fitted model. The test uses 17 coefficients because the interactions have (18-1) x (2-1) = 17 degrees of freedom.

There are lots of equivalent ways to define the same test in edgeR. The above code is just the shortest route.

ADD COMMENTlink modified 14 months ago • written 14 months ago by Gordon Smyth37k

Gordon,

Thank you for the answer. I'm excited about further understanding edgeR and design matrix. I really appreciate your time and input.

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.