Hi,
I need some help building the results tables. I have a multi-factor design: 3 seedlots (Org, Hol, Ves), 4 cultivars (C1, C2, C3, C4), 3 temperature points (K, P1, P2), and 2 biological replicates (72 samples in total). "Org" and "K" are the reference levels for seedlot and temperature, respectively. Here is the colData. ID is the grouping variable that combines seedlot, cultivar and temperature.
ID seedlot cultivar temp
OC1_K Org C1 K
OC1_P1 Org C1 P1
OC1_P2 Org C1 P2
OC2_K Org C2 K
OC2_P1 Org C2 P1
OC2_P2 Org C2 P2
OC3_K Org C3 K
OC3_P1 Org C3 P1
OC3_P2 Org C3 P2
OC4_K Org C4 K
OC4_P1 Org C4 P1
OC4_P2 Org C4 P2
HC1_K Hol C1 K
HC1_P1 Hol C1 P1
HC1_P2 Hol C1 P2
HC2_K Hol C2 K
HC2_P1 Hol C2 P1
HC2_P2 Hol C2 P2
HC3_K Hol C3 K
HC3_P1 Hol C3 P1
HC3_P2 Hol C3 P2
HC4_K Hol C4 K
HC4_P1 Hol C4 P1
HC4_P2 Hol C4 P2
VC1_K Ves C1 K
VC1_P1 Ves C1 P1
VC1_P2 Ves C1 P2
VC2_K Ves C2 K
VC2_P1 Ves C2 P1
VC2_P2 Ves C2 P2
VC3_K Ves C3 K
VC3_P1 Ves C3 P1
VC3_P2 Ves C3 P2
VC4_K Ves C4 K
VC4_P1 Ves C4 P1
VC4_P2 Ves C4 P2
What I did so far
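DEsl <- DESeqDataSetFromMatrix(countData = round(ordta), colData = mdta, design = ~ 1)
# keep genes with a count of at least 12 in at least 2 samples
keep <- rowSums(counts(DEsl) >= 12) >= 2
DEsl <- DEsl[keep, ]
# assign the result back, otherwise the size factors are not stored
DEsl <- estimateSizeFactors(DEsl)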
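To make the pre-filtering rule above concrete, here is a minimal base R sketch on a made-up count matrix. The gene names, sample names and all values are hypothetical and not taken from this dataset. It keeps a gene only if at least 2 samples have a count of 12 or more.

```r
# Toy count matrix: 3 genes x 4 samples (values invented for illustration)
toy_counts <- matrix(
  c( 0, 15,  3, 20,   # geneA: two samples reach 12
    30,  2,  1,  0,   # geneB: only one sample reaches 12
    12, 12, 12, 12),  # geneC: all samples reach 12
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)

# Same rule as in the post: count >= 12 in at least 2 samples
keep <- rowSums(toy_counts >= 12) >= 2
filtered <- toy_counts[keep, , drop = FALSE]
rownames(filtered)
```

With only 2 replicates per group, requiring 2 samples means a gene expressed in a single group can still pass the filter.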
1) For genes with DE over temperature across seedlots and cultivars
DEsl$sc <- factor(paste0(DEsl$seedlot, DEsl$cultivar))
design(DEsl) <- ~ sc + temp + sc:temp
dds <- DESeq(DEsl, reduced = ~ sc + temp, test = "LRT")
2) For genes responding to temp
design(DEsl) <- ~ sc + temp
dds <- DESeq(DEsl, reduced = ~ sc, test = "LRT")
3) To extract DE results for all interesting pairwise comparisons within a cultivar
DEsl$group <- factor(DEsl$ID)
design(DEsl) <- ~ 0 + group
dds <- DESeq(DEsl)
resultsNames(dds)
"groupHC1_K" "groupHC1_P1" "groupHC1_P2" "groupHC2_K" "groupHC2_P1" "groupHC2_P2"
"groupHC3_K" "groupHC3_P1" "groupHC3_P2" "groupHC4_K" "groupHC4_P1" "groupHC4_P2"
......
resOC1_1 <- results(dds, contrast = c("group", "OC1_P1", "OC1_K"))
resOC1_2 <- results(dds, contrast = c("group", "OC1_P2", "OC1_P1"))
resHC2_1 <- results(dds, contrast = c("group", "HC2_P1", "HC2_K"))
...
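Rather than typing each within-cultivar comparison by hand, the contrast specifications could be generated programmatically. The sketch below uses base R only and assumes the ID naming scheme from the colData above (seedlot letter, cultivar, underscore, temperature). The names `prefixes`, `temp_pairs` and `contrast_specs` are illustrative, and the choice of three temperature pairs per combination is an assumption about which comparisons are "interesting".

```r
# Seedlot-cultivar prefixes as they appear in the ID column (OC1, HC1, VC1, ...)
prefixes <- as.vector(outer(c("O", "H", "V"), paste0("C", 1:4), paste0))

# Temperature pairs as c(numerator, denominator); assumed set of comparisons
temp_pairs <- list(c("P1", "K"), c("P2", "P1"), c("P2", "K"))

# One c("group", numerator, denominator) vector per comparison
contrast_specs <- list()
for (p in prefixes) {
  for (tp in temp_pairs) {
    nm <- paste0(p, "_", tp[1], "_vs_", tp[2])
    contrast_specs[[nm]] <- c("group", paste0(p, "_", tp[1]), paste0(p, "_", tp[2]))
  }
}

# With the fitted object from the post, each entry could be passed to results():
# res_list <- lapply(contrast_specs, function(ct) results(dds, contrast = ct))
contrast_specs[["OC1_P1_vs_K"]]
```

Each list entry has the same three-element form as the hand-written `contrast` arguments above, so the results end up in one named list instead of many separate objects.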
Correct me if I am wrong in any of the above.
I would like to find genes that are DE for a cultivar at a given temperature between seedlots, e.g., test for DE genes between HC1_P1 and OC1_P1, with HC1_K and OC1_K as their respective reference levels. How do I set up a contrast matrix for this?
results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") ) ?
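As a hedged illustration only: if the intended comparison is the difference of differences (HC1_P1 - HC1_K) - (OC1_P1 - OC1_K), and the `~ 0 + group` design above is used so that each coefficient is one group's log2 mean, the contrast could be written as a numeric vector with +1 and -1 weights. The helper `make_contrast` is hypothetical, and `coef_names` is rebuilt here from the ID scheme as a stand-in for `resultsNames(dds)`. Whether this is the right statistical question for the experiment is a separate matter.

```r
# Stand-in for resultsNames(dds) under design ~ 0 + group (36 coefficients)
ids <- sort(as.vector(outer(
  as.vector(outer(c("H", "O", "V"), paste0("C", 1:4), paste0)),
  c("K", "P1", "P2"), paste, sep = "_")))
coef_names <- paste0("group", ids)

# Hypothetical helper: +1 for 'plus' coefficients, -1 for 'minus', 0 elsewhere
make_contrast <- function(coef_names, plus, minus) {
  stopifnot(all(c(plus, minus) %in% coef_names))
  v <- setNames(numeric(length(coef_names)), coef_names)
  v[plus] <- 1
  v[minus] <- -1
  v
}

# (HC1_P1 - HC1_K) - (OC1_P1 - OC1_K)
contrast_vec <- make_contrast(
  coef_names,
  plus  = c("groupHC1_P1", "groupOC1_K"),
  minus = c("groupHC1_K", "groupOC1_P1")
)

# With the fitted object, either form could then be tried:
# results(dds, contrast = contrast_vec)
# results(dds, contrast = list(c("groupHC1_P1", "groupOC1_K"),
#                              c("groupHC1_K", "groupOC1_P1")))
contrast_vec[contrast_vec != 0]
```

In the list form, the first character vector holds the coefficients that are added and the second those that are subtracted, which is why quoted names are combined with `c()` rather than with a minus sign.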
Hi, after reading about linear models, I think that the following design would let me extract the results I need.
Is this approach right for getting this contrast? results(dds, contrast= list(c("groupHC1_P1" - "groupHC1_K") , c("groupOC1_P1" - "groupOC1_K") )
Assuming that this is right, I was expecting to see similar results between the following two, but they vary a little bit. Is this difference expected?
Again, I recommend that you take questions about the statistical design of your experiment to a statistical collaborator. I just don't have sufficient time to field statistical consulting questions on the support site; I have to restrict myself to software-related issues.