Dear All!
I have the following setup for an RNA-seq experiment: 4 individuals, 2 of them have a disease, 2 are healthy controls, cells from each individual were cultured using two different conditions. So, the setup is very similar to the one described in the edgeR user guide (Section 3.5) or in the DESeq2 vignette (Section 3.12.1). Following the edgeR user guide I generated the design matrix (CTRL
are the healthy controls, DIS
the patients, condition A
and B
code for the two different cell culture conditions):
colD <- data.frame(type=rep(c("DIS", "CTRL"), each=4), condition=rep(c("A", "B"), 4), individual=rep(1:4, each=2), stringsAsFactors=TRUE) ## add the nested individual following the edgeR user guide colD <- cbind(colD, nind=factor(rep(rep(1:2, each=2), 2))) colD type condition individual nind 1 DIS A 1 1 2 DIS B 1 1 3 DIS A 2 2 4 DIS B 2 2 5 CTRL A 3 1 6 CTRL B 3 1 7 CTRL A 4 2 8 CTRL B 4 2 ## the model matrix for the experiment model.matrix(~type + type:nind + type:condition, colD) (Intercept) typeDIS typeCTRL:nind2 typeDIS:nind2 typeCTRL:conditionB 1 1 1 0 0 0 2 1 1 0 0 0 3 1 1 0 1 0 4 1 1 0 1 0 5 1 0 0 0 0 6 1 0 0 0 1 7 1 0 1 0 0 8 1 0 1 0 1 typeDIS:conditionB 1 0 2 1 3 0 4 1 5 0 6 0 7 0 8 0 attr(,"assign") [1] 0 1 2 2 3 3 attr(,"contrasts") attr(,"contrasts")$type [1] "contr.treatment" attr(,"contrasts")$nind [1] "contr.treatment" attr(,"contrasts")$condition [1] "contr.treatment"
The questions I want to answer in that RNA-Seq experiment are:
1) genes differentially expressed between DIS and CTRL, both cultured at condition A.
2) genes differentially expressed in DIS between B and A.
3) genes differentially expressed in CTRL between B and A.
4) genes differentially regulated between (DIS, B vs A) and (CTRL, B vs A).
Now, to answer question 2) I can extract coefficient "typeDIS:conditionB"
, to answer 3 "typeCTRL:conditionB"
and the contrast between both I can extract with results(..., contrast=c(0, 0, 0, 0, -1, 1))
. It is however not clear to me how to answer question 1), i.e. the difference between DIS
cells and condition A
and CTRL
cells and condition A
. I first thought that I can use "typeDIS"
but to me that looks rather like a comparison between all DIS
and all CTRL
samples (so conditions A
and B
pooled).
How can I answer question 1) using the design above?
thanks in advance for any comments and answers!
jo
There's also an additional subtlety of the individual effect, which confounds the interpretation of the
typeDIS
term. Consider an example where both samples of individual 1 are highly expressed, while both samples of individual 2 are lowly expressed. Assume that the average expression of individuals 1 and 2 is equal to the average expression of individuals 3 and 4, i.e., no disease effect for condition A (or B). For simplicity, assume that both samples for the same individual have the same expression. Upon fitting, one will find thattypeDIS
is non-zero and positive whiletypeDIS:nind2
is large and negative (all other non-intercept coefficients are zero). For example, if we assign the matrix in the original post todesign
, we can get:Treating
typeDIS
as the disease effect for condition A would suggest a difference where there is none. Indeed, the example in Section 3.5 in the user's guide never compares samples directly between disease states, but instead compares the log-fold changes (similar to question 4 in the original post). If we were to compare directly between diseases, I would probably usevoom
with a~type*condition
model and block onindividual
withduplicateCorrelation
.Thanks Aaron!
There is no way that I could get that working in
DESeq2
using e.g. a linear combination of coefficients (e.g.contrast=c(-0.5, 0.5, -0.5, 0.5, 1, -1)
)?Thanks! Absolutely makes sense. Nice explanation!
However (also regarding Aaron's comment below),
typeDis
is in fact Disease, conditionA in individual 1, right? So that way, i.e. with the present pairing setup, there is no way I can directly compare Disease vs Control. However, if I didn't have the pairing I could in fact extract the difference Dis vs Ctrl, both condition A fromtypeDis
if I got it correctly.Exactly. I didn't think things all the way through - as Aaron notes, there are some things you can't really get at with this experimental design in the context of a generalized linear model.