Question

DEseq2 extracting results from multifactor design

0

Entering edit mode

Akhil • 0

@3a5eb334

Last seen 3.8 years ago

Norway

Hi,

I need some help in building the results tables. I have a multi-factor design i.e, 3 seedlots (Org, Hol, Ves), 4 cultivars (C1, C2, C3, C4) and 3 temperature points (K, P1, P2) and 2 biological replicates(72 samples in total). "Org" and "K" are reference levels for seed lot & temperature respectively. Here is the col data. ID is the grouped variable of seedlot, time and temperature

 ID seedlot cultivar temp
OC1_K    Org   C1   K
OC1_P1   Org   C1   P1
OC1_P2   Org   C1   P2 
OC2_K    Org   C2   K
OC2_P1   Org   C2   P1
OC2_P2   Org   C2   P2 
OC3_K    Org   C3   K
OC3_P1   Org   C3   P1
OC3_P2   Org   C3   P2
OC4_K    Org   C4   K
OC4_P1   Org   C4   P1
OC4_P2   Org   C4   P2
HC1_K    Hol   C1   K
HC1_P1   Hol   C1   P1
HC1_P2   Hol   C1   P2 
HC2_K    Hol   C2   K
HC2_P1   Hol   C2   P1
HC2_P2   Hol   C2   P2
HC3_K    Hol   C3   K
HC3_P1   Hol   C3   P1
HC3_P2   Hol   C3   P2
HC4_K    Hol   C4   K
HC4_P1   Hol   C4   P1
HC4_P2   Hol   C4   P2
VC1_K    Ves   C1   K
VC1_P1   Ves   C1   P1
VC1_P2   Ves   C1   P2 
VC2_K    Ves   C2   K
VC2_P1   Ves   C2   P1
VC2_P2   Ves   C2   P2
VC3_K    Ves   C3   K
VC3_P1   Ves   C3   P1
VC3_P2   Ves   C3   P2
VC4_K    Ves   C4   K
VC4_P1   Ves   C4   P1
VC4_P2   Ves   C4   P2

What I did so far

DEsl <-DESeqDataSetFromMatrix(countData= round(ordta), colData = mdta, design = ~ 1)

keep <- rowSums(counts(DEsl) >= 12) >= 2 
DEsl <- DEsl[keep,]
estimateSizeFactors(DEsl)

1) For genes with DE over temperature across seedlots and cultivars 
DEsl$sc<- factor(paste0(DEsl$seedlot, DEsl$cultivar))
design(DEsl)<- ~ sc + temp +sc:temp
dds<-DESeq(DEsl, reduced= ~ sc+temp , test= "LRT")

2) For genes responding to temp
design(DEsl)<- ~ sc + temp 
dds<-DESeq(DEsl, reduced= ~ sc , test= "LRT")

3)To extract DE results of all interesting pairwise comparisons of a cultivar (within cultivar) 
design$group<- DEsl$id
design(DEsl)<- ~ 0+group
dds<-DESeq(DEsl)
resultsNames(dds)
 "groupHC1_K"  "groupHC1_P1" "groupHC1_P2" "groupHC2_K"  "groupHC2_P1" "groupHC2_P2"
 "groupHC3_K"  "groupHC3_P1" "groupHC3_P2" "groupHC4_K"  "groupHC4_P1" "groupHC4_P2"
  ......
resOC1_1 <-results(dds, contrast= c("group","OC1_P1","OC1_K"))
resOC1_2<-results(dds, contrast= c("group","OC1_P2","OC1_P1"))
resHC2_1<-results(dds, contrast= c("group","HC2_P1","HC2_K"))
...

Correct me if i am wrong in any of the above.

I would like to find genes with DE of a cultivar at a temp between seedlots eg: test DE genes between HC1_P1 and OC1_P1 with HC1_K and OC1_K as reference levels respectively. How do i set up a contrast matrix for this ?

results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") ) ?

timecourse DESeq2 RNASeq • 1.4k views

ADD COMMENT • link updated 3.8 years ago by Michael Love 43k • written 3.9 years ago by Akhil • 0

score 0 · Answer 1 · 2021-08-31

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 16 days ago

United States

Unfortunately, I have to restrict my time to software related questions here. Your question is one of how to set up the statistical analysis and interpret the results, and for that I recommend working with a statistical collaborator at your institute, or someone familiar with linear models in R.

ADD COMMENT • link 3.9 years ago Michael Love 43k

0

Entering edit mode

Hi, After reading about linear models I think that the following design would let me extract the results I need

design(DEsl)<- ~0+sc+sc:temp
dds_nes<-DESeq(DEsl)
resultsNames(dds_nes)
 [1] "scHC1"        "scHC2"        "scC3"        "scHC4"        "scOC1"        "scOC2"       
 [7] "scOC3"        "scOC4"        "scVC1"        "scVC2"        "scVC3"        "scVC4"       
[13] "scHC1.tempP1" "scHC2.tempP1" "scHC3.tempP1" "scHC4.tempP1" "scOC1.tempP1" "scOC2.tempP1"
[19] "scOC3.tempP1" "scOC4.tempP1" "scVC1.tempP1" "scVC2.tempP1" "scVC3.tempP1" "scVC4.tempP1"
[25] "scHC1.tempP2" "scHC2.tempP2" "scHC3.tempP2" "scHC4.tempP2" "scOC1.tempP2" "scOC2.tempP2"
[31] "scOC3.tempP2" "scOC4.tempP2" "scVC1.tempP2" "scVC2.tempP2" "scVC3.tempP2" "scVC4.tempP2"

res<-results(dds_nes, contrast= list("scHC1.tempP1","scOC1.tempP1")

is this approach right to get this contrast? : - results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") )

Assuming that this is right, I was expecting to see similar results between the following two, but they vary a little bit is this difference expected?

resOC1_1 <-results(dds, contrast= c("group","OC1_P1","OC1_K"), alpha = 0.05, lfcThreshold = log2(1.5))
summary(resOC1_1)

out of 354034 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.58 (up)    : 1447, 0.41%
LFC < -0.58 (down) : 223, 0.063%
outliers [1]       : 0, 0%
low counts [2]     : 219645, 62%
(mean count < 12)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

res_nesOC1_1<- results(dds_nes, name="scOC1.tempP1", alpha = 0.05, lfcThreshold = log2(1.5))
summary(res_nesOC1_1)
out of 354034 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.58 (up)    : 1393, 0.39%
LFC < -0.58 (down) : 222, 0.063%
outliers [1]       : 0, 0%
low counts [2]     : 219645, 62%
(mean count < 12)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

ADD REPLY • link 3.8 years ago Akhil • 0

0

Entering edit mode

Again, I'll recommend you take questions about statistical design of your experiment to a statistical collaborator. I just don't have sufficient time to field statistical consulting questions on the support site, I have to restrict to software related issues.

ADD REPLY • link 3.8 years ago Michael Love 43k