Question

Analyzing nested experimental conditions in DESeq2

cryptic • 0

@cryptic-20547

I have searched both Bioconductor and Biostars but haven't found any closely relevant results. My experimental setup is as follows:

sample  strain  condition
samp01  WildType    time0
samp02  WildType    time0
samp03  WildType    time0
samp04  WildType    time1
samp05  WildType    time1
samp06  WildType    time1
samp07  WildType    time2
samp08  WildType    time2
samp09  WildType    time2
samp10  mutation_1  time0
samp11  mutation_1  time0
samp12  mutation_1  time0
samp13  mutation_1  time1
samp14  mutation_1  time1
samp15  mutation_1  time1
samp16  mutation_1  time2
samp17  mutation_1  time2
samp18  mutation_1  time2
samp19  mutation_2  time0
samp20  mutation_2  time0
samp21  mutation_2  time0
samp22  mutation_2  time1
samp23  mutation_2  time1
samp24  mutation_2  time1
samp25  mutation_2  time2
samp26  mutation_2  time2
samp27  mutation_2  time2

Essentially, WildType is my control, and mutation_1 and mutation_2 are the experimental strains. Each of the three strains was sampled at three time points. Given this data, can I set up the analysis as follows?

data_all <- DESeqDataSetFromMatrix(countData=mycounts, colData=mypheno, design = ~ strain + condition + strain:condition)
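To see what this design formula expands to, here is a minimal base R sketch that rebuilds the sample table above and prints the model matrix columns. It assumes the factor levels are set explicitly so that WildType and time0 are the reference levels; otherwise R picks the reference by sort order, which may not be WildType. The `mypheno` object here is a reconstruction for illustration, not the poster's actual object.

```r
# Rebuild the sample table from the question:
# 3 strains x 3 time points x 3 replicates = 27 samples.
strain_levels <- c("WildType", "mutation_1", "mutation_2")
time_levels   <- c("time0", "time1", "time2")

mypheno <- data.frame(
  # Explicit levels make WildType and time0 the reference levels (an assumption).
  strain    = factor(rep(strain_levels, each = 9), levels = strain_levels),
  condition = factor(rep(rep(time_levels, each = 3), times = 3),
                     levels = time_levels),
  row.names = sprintf("samp%02d", 1:27)
)

# Expand the design formula into its model matrix (base R only, no DESeq2 needed).
mm <- model.matrix(~ strain + condition + strain:condition, data = mypheno)

# Intercept, 2 strain terms, 2 time terms and 4 interaction terms.
print(colnames(mm))
```

With three strains and three time points the matrix has nine columns, one per coefficient that will be estimated for each gene.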

deseq2 • 510 views

ADD COMMENT • link updated 5.0 years ago by Michael Love 41k • written 5.0 years ago by cryptic • 0

score 1 · Answer 1 · 2019-04-16

Michael Love 41k

@mikelove

Yes, that's the design I would use, depending on your experimental questions. Take a look at the DESeq2 workflow, which has a time series example (it has only two conditions, but the code is similar, so you can compare).
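For orientation, the time series example in the workflow compares the full model against a reduced model without the interaction term, using a likelihood ratio test. Adapted to the design in the question, a rough sketch could look like the following. The counts are simulated placeholders, and `mycounts` and `mypheno` are reconstructed here for illustration only; whether the LRT is the right test depends on the experimental question.

```r
library(DESeq2)

set.seed(1)
strain_levels <- c("WildType", "mutation_1", "mutation_2")
time_levels   <- c("time0", "time1", "time2")

# Sample table as in the question; reference levels set explicitly (assumption).
mypheno <- data.frame(
  strain    = factor(rep(strain_levels, each = 9), levels = strain_levels),
  condition = factor(rep(rep(time_levels, each = 3), times = 3),
                     levels = time_levels),
  row.names = sprintf("samp%02d", 1:27)
)

# Simulated placeholder counts: 200 genes x 27 samples, no real signal.
n_genes  <- 200
gene_mu  <- 2^runif(n_genes, 4, 10)
mycounts <- matrix(rnbinom(n_genes * 27, mu = rep(gene_mu, times = 27), size = 5),
                   nrow = n_genes,
                   dimnames = list(paste0("gene", seq_len(n_genes)),
                                   rownames(mypheno)))
storage.mode(mycounts) <- "integer"

dds <- DESeqDataSetFromMatrix(countData = mycounts, colData = mypheno,
                              design = ~ strain + condition + strain:condition)

# Likelihood ratio test: full model vs. a reduced model without the interaction.
# Small p-values flag genes whose time profile differs between strains.
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ strain + condition)
res_lrt <- results(dds_lrt)
head(res_lrt[order(res_lrt$padj), ])
```

Note that with the LRT the p-value refers to the whole full-versus-reduced comparison, while the log2 fold change column still reports a single coefficient.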

ADD COMMENT • link 5.0 years ago Michael Love 41k

Michael: Thanks for your swift response. I will take a look at the workflow on Bioconductor. In the meantime, I went ahead and ran the model as above. According to resultsNames(data_all_deseq), I have 9 comparisons, but the results table contains only one padj column. It looks like the p-value is being calculated over all conditions, which seems wrong.

So I probably need to use the contrast option to explicitly compare pairs of conditions. For example:

WildType_time0 vs mutation_1_time0
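As a hedged sketch of how such a pairwise comparison might be extracted: as far as I understand, results() called without arguments returns a table for a single coefficient (the last one listed by resultsNames()), which would explain the single padj column. The example below uses simulated placeholder counts and assumes WildType and time0 are the reference levels; the coefficient names should be checked against the resultsNames() output before they are used in a contrast.

```r
library(DESeq2)

set.seed(1)
strain_levels <- c("WildType", "mutation_1", "mutation_2")
time_levels   <- c("time0", "time1", "time2")

# Sample table as in the question; reference levels set explicitly (assumption).
mypheno <- data.frame(
  strain    = factor(rep(strain_levels, each = 9), levels = strain_levels),
  condition = factor(rep(rep(time_levels, each = 3), times = 3),
                     levels = time_levels),
  row.names = sprintf("samp%02d", 1:27)
)

# Simulated placeholder counts: 200 genes x 27 samples, no real signal.
n_genes  <- 200
gene_mu  <- 2^runif(n_genes, 4, 10)
mycounts <- matrix(rnbinom(n_genes * 27, mu = rep(gene_mu, times = 27), size = 5),
                   nrow = n_genes,
                   dimnames = list(paste0("gene", seq_len(n_genes)),
                                   rownames(mypheno)))
storage.mode(mycounts) <- "integer"

data_all <- DESeqDataSetFromMatrix(countData = mycounts, colData = mypheno,
                                   design = ~ strain + condition + strain:condition)
data_all_deseq <- DESeq(data_all)

# Nine coefficients; check these names before building contrasts.
print(resultsNames(data_all_deseq))

# mutation_1 vs WildType at time0: with an interaction in the design,
# the strain main effect is the effect at the reference time point.
res_t0 <- results(data_all_deseq,
                  contrast = c("strain", "mutation_1", "WildType"))

# mutation_1 vs WildType at time1: main effect plus the matching interaction term.
res_t1 <- results(data_all_deseq,
                  contrast = list(c("strain_mutation_1_vs_WildType",
                                    "strainmutation_1.conditiontime1")))
```

Each call returns its own table with its own padj column, so every pairwise comparison of interest is extracted separately.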

ADD REPLY • link 5.0 years ago cryptic • 0

I’d recommend working with a statistician if it’s your first time working with linear models. Or you can take a look at the workflow and vignette, which are fairly extensive. Unfortunately, I don’t have much time for statistical questions and have to reserve my time for software-related questions.

ADD REPLY • link 5.0 years ago Michael Love 41k