Question

edgeR with Successive Differences Contrast Coding?

Hannes • 0

United Kingdom

Hi there,

I am wondering whether it is OK to use edgeR's glmQLFTest on a model matrix with "successive differences contrast coding" as generated by MASS::contr.sdif. If that is not the case, what would be a good way to test the questions outlined below?

My design
There are three categorical covariates: 'tissue' (7 levels), 'RNase' treatment (2 levels, yes/no), and 'tag' (2 levels, yes/no). I want to account for the variation explained by 'tissue', but I don't care much about its coefficients (it would be a random effect if that were possible).

What's interesting to me is whether there is differential expression between "tag no, RNase no" and "tag yes, RNase no". I also want to know if there's differential expression between "tag yes, RNase no" and "tag yes, RNase yes".

So, I have combined 'RNase' and 'tag' into one 4-level factor called 'treatFact' with ordered levels: "tag no, RNase no" < "tag yes, RNase no" < "tag yes, RNase yes" < "tag no, RNase yes". I then generated a design matrix like this:

design <- model.matrix(~ tiss + treatFact, data=sampInfo, contrasts.arg = list(treatFact=MASS::contr.sdif))
head(design)
#   (Intercept) tiss2 tiss3 tiss4 tiss5 tiss6 tiss7 treatFact2-1 treatFact3-2 treatFact4-3
# 1           1     0     0     0     0     0     0         0.25         -0.5        -0.25
# 2           1     0     0     0     0     0     0         0.25         -0.5        -0.25
# 3           1     0     0     0     0     0     0         0.25         -0.5        -0.25
# 4           1     0     0     0     0     0     0        -0.75         -0.5        -0.25
# 5           1     0     0     0     0     0     0        -0.75         -0.5        -0.25
# 6           1     0     0     0     0     0     0        -0.75         -0.5        -0.25

So, the contrasts treatFact2-1 and treatFact3-2 are what I am interested in.
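
As a sanity check on what these coefficients mean, here is a minimal sketch with an ordinary linear model. The toy data and level names are made up for illustration and are not from the real experiment. It only shows that, under contr.sdif, each coefficient is the difference between the means of two adjacent levels and the intercept is the unweighted mean of the level means.

```r
library(MASS)  # provides contr.sdif

# Coding matrix for a 4-level factor: rows are levels, columns are contrasts
print(contr.sdif(4))

# Hypothetical toy data: 3 replicates per level, level means 1, 3, 6 and 10
lev <- c("tagNo_RNaseNo", "tagYes_RNaseNo", "tagYes_RNaseYes", "tagNo_RNaseYes")
toy <- data.frame(
  treatFact = factor(rep(lev, each = 3), levels = lev),
  y = rep(c(1, 3, 6, 10), each = 3) + rep(c(-0.1, 0, 0.1), times = 4)
)

# Fit with successive-differences coding for treatFact
fit <- lm(y ~ treatFact, data = toy, contrasts = list(treatFact = contr.sdif))

# Coefficients: intercept, then level2 - level1, level3 - level2, level4 - level3
print(coef(fit))

# Differences between adjacent level means, for comparison
print(diff(tapply(toy$y, toy$treatFact, mean)))
```

The coefficient names follow the same "treatFact2-1" pattern as in the design matrix above, so the printed coefficients can be matched directly to the differences of level means.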

My concern
The 'treatFact' columns of the design matrix are not independent of one another. Does an F-test as done by glmQLFTest make sense in this situation? With R's ordinary (treatment) contrast coding, there would be only 1s and 0s in the model matrix. If one column of such a matrix is taken away, the individuals/samples with 1s in that column would then contribute more to the intercept, but I am not sure this works the same way with successive differences.

NB, I have seen the instructions in the edgeR user guide for setting up custom contrasts. But I think (correct me if I am wrong) this does not help me here as I am interested in treatFact2-1 and treatFact3-2 over all levels of 'tissue'.

Many thanks, Hannes

hypothesisTesting contrastCoding edger ExperimentalDesign expressionDifferences • 636 views

updated 12 months ago by Gordon Smyth 50k • written 12 months ago by Hannes • 0

score 2 · Accepted Answer · 2023-04-26

edgeR works with any contrast coding. There's no problem at all: columns of the design matrix are never assumed to be independent of one another. Indeed, you will get exactly the same results from edgeR for the same comparisons whether you use custom contrasts as described in the User's Guide or you use contr.sdif.
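
To illustrate the equivalence, here is a rough sketch on simulated data. The counts, the sample sheet and the stand-in level names A to D are assumptions for illustration only; just the variable names tiss, treatFact and sampInfo are taken from the question. It fits the same additive model twice, once with successive-differences coding and once with the default treatment coding plus a custom contrast, and compares the test for the third versus second treatment level.

```r
library(edgeR)

set.seed(1)

# Hypothetical sample sheet: 7 tissues x 4 treatment groups, one library each
lev <- c("A", "B", "C", "D")  # stand-ins for the four tag/RNase combinations
sampInfo <- expand.grid(tiss = factor(1:7),
                        treatFact = factor(lev, levels = lev))

# Simulated counts with no real differential expression (illustration only)
nGenes <- 500
counts <- matrix(rnbinom(nGenes * nrow(sampInfo), mu = 50, size = 10),
                 nrow = nGenes,
                 dimnames = list(paste0("gene", seq_len(nGenes)), NULL))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# (a) Successive-differences coding: test the coefficient directly
designS <- model.matrix(~ tiss + treatFact, data = sampInfo,
                        contrasts.arg = list(treatFact = MASS::contr.sdif))
yS <- estimateDisp(y, designS)
fitS <- glmQLFit(yS, designS)
resS <- glmQLFTest(fitS, coef = "treatFact3-2")

# (b) Treatment coding: the same comparison (level C minus level B)
#     written as a contrast vector over the design columns
designT <- model.matrix(~ tiss + treatFact, data = sampInfo)
yT <- estimateDisp(y, designT)
fitT <- glmQLFit(yT, designT)
con <- setNames(numeric(ncol(designT)), colnames(designT))
con["treatFactC"] <- 1
con["treatFactB"] <- -1
resT <- glmQLFTest(fitT, contrast = con)

# Compare log-fold-changes and p-values between the two parametrizations
print(all.equal(resS$table$logFC, resT$table$logFC, tolerance = 1e-4))
print(all.equal(resS$table$PValue, resT$table$PValue, tolerance = 1e-4))
```

Both design matrices span the same column space, so the fitted values are the same and the two tests should agree up to the numerical tolerance of the iterative fitting. Because the model is additive, either version estimates the treatment difference across all tissue levels.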

Just as an aside, there would be no advantage in treating tissue as random for a balanced design like this. The additive linear model is fine.