Pairwise comparisons for "both between and within subject"experiment using DESeq2
1
0
Entering edit mode
@ricardo-rodrigues-18386
Last seen 6.1 years ago
USA

 

Hi,

I have questions regarding pairwise comparisons for "both between and within subject" RNA-seq experiments analyzed using DESeq2.

My experiment comprises of subjects within block randomly assigned to one of three environmental treatments: 1) CT - control environment condition (n=6), 2) EA - environmental condition A (n=6), and 3) EB - environmental condition B (n=5), in which samples were collected from all subjects at day 14 and 22 of the experiment, so a total of 34 unique samples derived from 17 experimental units.

The experiment was analyzed similarly to what is proposed in the DESeq2 vignette for “Group-specific condition effects, individuals nested within groups” coupled with “Levels without samples”. Below is the source code:

## Frame matrix as needed
block <- factor(rep(c(5,1,2,6,4,3,1,4,5,3,6,2,6,2,4,1,5),each=2))
trt <- factor(c(rep("CT",12),rep("EA",12),rep("EB",10)))
time <- factor(rep(c(14,22)))
coldata <- data.frame(block, trt, time)
coldata
  block trt time
1      5  CT   14
2      5  CT   22
3      1  CT   14
4      1  CT   22
...
13     1  EA   14
14     1  EA   22
...
17     5  EA   14
18     5  EA   22
...
31     1  EB   14
32     1  EB   22
33     5  EB   14
34     5  EB   22

## Model matrix for full model (trt HS is missing a experimental unit, so delete a column that sum=0)
m1 <- model.matrix(~trt+block:trt+trt:time, coldata)
all.zero <- apply(m1, 2, function(x) all(x==0))
all.zero
(Intercept)        trtEA        trtEB trtCT:block2 trtEA:block2 trtEB:block2 trtCT:block3 
       FALSE        FALSE        FALSE        FALSE        FALSE        FALSE        FALSE 
trtEA:block3 trtEB:block3 trtCT:block4 trtEA:block4 trtEB:block4 trtCT:block5 trtEA:block5 
       FALSE         TRUE        FALSE        FALSE        FALSE        FALSE        FALSE 
trtEB:block5 trtCT:block6 trtEA:block6 trtEB:block6 trtCT:time22 trtEA:time22 trtEB:time22 
       FALSE        FALSE        FALSE        FALSE        FALSE        FALSE        FALSE 
m1 <- m1[,-which(all.zero)]

## Model matrix for reduced model (trt HS is missing a experimental unit, so delete a column that sum=0)
m2 <- model.matrix(~trt+block:trt, coldata)
all.zero <- apply(m2, 2, function(x) all(x==0))
all.zero
 (Intercept)        trtEA        trtEB trtCT:block2 trtEA:block2 trtEB:block2 trtCT:block3 
       FALSE        FALSE        FALSE        FALSE        FALSE        FALSE        FALSE 
trtEA:block3 trtEB:block3 trtCT:block4 trtEA:block4 trtEB:block4 trtCT:block5 trtEA:block5 
       FALSE         TRUE        FALSE        FALSE        FALSE        FALSE        FALSE 
trtEB:block5 trtCT:block6 trtEA:block6 trtEB:block6 
       FALSE        FALSE        FALSE        FALSE 
m2 <- m2[,-which(all.zero)]

## Model design
dds <- DESeqDataSetFromMatrix(countData=rcount, colData=coldata, design=m1)
## design: contains the full statistical model

## Filtering for genes that non-normalized sum of expression is >=1 in at least 5 samples
keep <- rowSums(counts(dds)>=1)>=5
table(keep)
dds <- dds[keep,]

## Differential expression analysis
dds <- DESeq(dds, test="LRT", reduced=m2)
## reduced: contains the reduced statistical model, which doesn't contain the variable of interest

## Name of columns available as results
resultsNames(dds)
 [1] "Intercept"    "trtEA"        "trtEB"        "trtCT.block2" "trtEA.block2" "trtEB.block2"
 [7] "trtCT.block3" "trtEA.block3" "trtCT.block4" "trtEA.block4" "trtEB.block4" "trtCT.block5"
[13] "trtEA.block5" "trtEB.block5" "trtCT.block6" "trtEA.block6" "trtEB.block6" "trtCT.time22"
[19] "trtEA.time22" "trtEB.time22"

## Build matrix of results
res_EAvsCT_d14 <- results(dds, test="Wald", name=("trtEA"))
res_EBvsCT_d14 <- results(dds, test="Wald", name=("trtEB"))
res_EBvsEA_d14 <- results(dds, test="Wald", contrast=list("trtEB","trtEA"))
res_EAvsCT_d22 <- results(dds, test="Wald", contrast=list("trtEA.time22","trtCT.time22"))
res_EBvsCT_d22 <- results(dds, test="Wald", contrast=list("trtEB.time22","trtCT.time22"))
res_EAvsEB_d22 <- results(dds, test="Wald", contrast=list("trtEB.time22","trtEA.time22"))
res_d22vs14_CT <- results(dds, test="Wald", name=("trtCT.time22"))
res_d22vs14_EA <- results(dds, test="Wald", contrast=list("trtEA.time22","trtEA"))
res_d22vs14_EB <- results(dds, test="Wald", contrast=list("trtEB.time22","trtEB"))

I am interested in the following pairwise comparisons:

  1. EB vs EA on day 14
  2. EA vs CT on day 22
  3. EB vs CT on day 22
  4. EB vs EA on day 22
  5. Day 22 vs 14 for treatment CT
  6. Day 22 vs 14 for treatment EA
  7. Day 22 vs 14 for treatment EB

I read the vignettes() and ?results for DESeq2, and also looked online for questions posted in forums that could help me understand if the contrast terms I am using are adequately calling the comparisons I am interested in, but, so far, I haven’t quite understood it. So, my questions are:

  1. Are my result contrast terms adequately calling the comparisons I am interested in?
  2. Do all contrasts of interest account for “CT on day 14” baseline? For example, in “EB vs EA on day 14”, is the actual comparison “EB – EA” or “(EB – CT on day14) – (EA – CT on day 14)”?

 

I appreciate any help very much.
Thanks,
Ricardo Rodrigues

 

pairwise deseq2 extracting contrasts resultsnames results deseq2 • 1.1k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Hi Ricardo,

You can make comparisons 5-7 using results() with the “name” argument and the interaction term, eg "trtCT.time22” for the 22 vs 14 effect for control. The comparison 1-4 are not possible with this design, because you can’t control for the block/subject with fixed effects and compare across groups with different blocks/subjects. You could take an alternative approach with limma voom and the duplicateCorrelation() function, but you can’t make these comparisons with fixed effects models.

ADD COMMENT
0
Entering edit mode

Hi Michael,

Thank you very much for your help.

I am interested in treatment comparison within each timepoint, as mentioned above, so I'm thinking in analyzing timepoints separately. Thus, I plan to use the following models: full = ~block + trt and reduced = ~block. Would it be a reasonable approach? Any suggestions?

Using the proposed models, I get:

resultsNames(dds)
[1] "Intercept"    "blk_2_vs_1"   "blk_3_vs_1"   "blk_4_vs_1"   "blk_5_vs_1"   "blk_6_vs_1"   "trt_EA_vs_CT"
[8] "trt_EB_vs_CT"

For the comparison "EB vs EA", I am using: results(dds, test="Wald", contrast=c("trt", "EB","EA")), although I am wondering if this approach generates the similar results to those of relevel(trt, "EA"). If I am to use the relevel option, do I need to re-run the whole model after releveling or I can I relevel only at the time of extracting the comparison of interest?

Thank you,
Ricardo

ADD REPLY
0
Entering edit mode

You should drop "block" from the full and reduced models here. You don't have repeated measurements for the samples when you subset to a single timepoint. It should just be ~trt. And yes you can use contrast to compare EB and EA, you do not need to relevel and re-run DESeq().

ADD REPLY

Login before adding your answer.

Traffic: 966 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6