DESeq2 time course confusion
rajpal22288 ▴ 10
@rajpal22288-22395
Last seen 4 weeks ago
Italy

I am trying to see the effect of different conditions at different time points, but after going through all the posts and vignettes I am confused.

The call below is where I get the error that the model matrix is not full rank. After reading the posts, I understood that I have to create another column with nested values, but I don't understand which column it should be based on. Is it the time column?

dds <- DESeqDataSetFromMatrix(htseq_data, sampleTable, design = ~ condition + time + condition:time)

And this is my sampleTable:

LIMS                                Client      condition  time
SAMPLE-IN-POOL19-0050-R0001.counts  BU_243      PH         1
SAMPLE-IN-POOL19-0051-R0001.counts  BU_244      PH         1
SAMPLE-IN-POOL19-0052-R0001.counts  BU_245      PH         1
SAMPLE-IN-POOL19-0031-R0001.counts  BU_247      TCP        1
SAMPLE-IN-POOL19-0032-R0001.counts  BU_248      TCP        1
RNA20-0041-R0001.counts             BU_249      TCP        1
SAMPLE-IN-POOL19-0039-R0001.counts  BU_258      CO         0
SAMPLE-IN-POOL19-0040-R0001.counts  BU_259      CO         0
SAMPLE-IN-POOL19-0041-R0001.counts  BU_260      CO         0
SAMPLE-IN-POOL19-0042-R0001.counts  BU_261      PH         6
SAMPLE-IN-POOL19-0043-R0001.counts  BU_263      PH         6
SAMPLE-IN-POOL19-0044-R0001.counts  BU_264      PH         6
SAMPLE-IN-POOL19-0045-R0001.counts  BU_265      TCP        6
SAMPLE-IN-POOL19-0046-R0001.counts  BU_266      TCP        6
RNA20-0043-R0001.counts             BU_268_bis  TCP        6
SAMPLE-IN-POOL19-0035-R0001.counts  BU_252      PH         3
SAMPLE-IN-POOL19-0033-R0001.counts  BU_250      PH         3
SAMPLE-IN-POOL19-0034-R0001.counts  BU_251      PH         3
SAMPLE-IN-POOL19-0036-R0001.counts  BU_253      TCP        3
SAMPLE-IN-POOL19-0037-R0001.counts  BU_255      TCP        3
RNA20-0042-R0001.counts             BU_254      TCP        3


Thank you

DESeq2 DifferentialExpression • 212 views

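CO is nested within time point 0: the CO samples exist only at time 0, and no PH or TCP samples were taken at time 0. That is why the design with the condition:time interaction is not full rank, so for the given design you would need to remove CO from the analysis.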
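
The rank problem can be checked in base R before DESeq2 is involved. The sketch below rebuilds only the condition and time columns of the sample table from the question, treats time as a factor (an assumption; the original table may store it as a number), and compares the number of design columns with the rank of the model matrix.

```r
# Rebuild condition and time in the same row order as the sampleTable above.
sampleTable <- data.frame(
  condition = factor(c(rep(c("PH", "TCP"), each = 3), rep("CO", 3),
                       rep(c("PH", "TCP"), each = 3),
                       rep(c("PH", "TCP"), each = 3))),
  time = factor(c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6)))
)

# Cross-tabulate: CO appears only at time 0, and time 0 contains only CO.
print(table(sampleTable$condition, sampleTable$time))

# Model matrix for the design used in the question.
mm <- model.matrix(~ condition + time + condition:time, data = sampleTable)

n_coef  <- ncol(mm)      # coefficients the design asks for
mm_rank <- qr(mm)$rank   # coefficients the data can actually identify
cat("columns:", n_coef, " rank:", mm_rank, "\n")
```

The matrix has more columns than its rank because most of the condition-by-time combinations that involve CO or time 0 contain no samples, so their coefficients cannot be estimated.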


But is there any other way around this? I want to see the difference in gene expression between each condition at each time point and the control. Thank you


Then you probably (I guess) would need a full-factorial design, i.e. combining the two columns into one, like condition_time, e.g. PH_1, TCP_1, CO_0 and so on. Let's call this column factorial, use ~ factorial as the design, and then define contrasts that meaningfully describe your experiment.
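
A minimal sketch of that combined column, assuming the sample table from the question and CO_0 as the reference group. The base R part runs on its own. The DESeq2 calls at the end are shown as comments because they need the count matrix `htseq_data`, which is not reproduced here.

```r
# Rebuild condition and time in the same row order as the sampleTable above.
sampleTable <- data.frame(
  condition = c(rep(c("PH", "TCP"), each = 3), rep("CO", 3),
                rep(c("PH", "TCP"), each = 3),
                rep(c("PH", "TCP"), each = 3)),
  time = c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6))
)

# Combine the two columns into one grouping factor, e.g. "PH_1", "CO_0".
sampleTable$factorial <- factor(
  paste(sampleTable$condition, sampleTable$time, sep = "_")
)
# Make the control group the reference level.
sampleTable$factorial <- relevel(sampleTable$factorial, ref = "CO_0")

# One coefficient per observed group, so this design is full rank.
mm <- model.matrix(~ factorial, data = sampleTable)
full_rank <- qr(mm)$rank == ncol(mm)
print(levels(sampleTable$factorial))
print(full_rank)

# With DESeq2 loaded and htseq_data available (not run here):
# dds <- DESeqDataSetFromMatrix(htseq_data, sampleTable, design = ~ factorial)
# dds <- DESeq(dds)
# res_PH1 <- results(dds, contrast = c("factorial", "PH_1", "CO_0"))
```

Each treated group can then be compared with the control by changing the second element of the contrast, for example "TCP_6" instead of "PH_1". The rows of the sample table must be in the same order as the columns of the count matrix.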


Thank you. I already did that, but I am not sure it gives the real picture (gene expression in a time-dependent manner).

Another idea I had is to run two separate analyses with this design: 1) a time-course analysis with TCP and CO, and 2) the same analysis with PH and CO.

Of course, the P-values will differ from what I would get with all the samples included, but I think it would work.

What do you think?
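
One caveat with the split analyses can be checked in base R. In a subset that contains one treatment plus CO, condition is still completely determined by time, so the interaction design stays rank deficient, while a time-only design is full rank. The sketch below assumes the sample table from the question and treats CO as time point 0 of the treated series, which is an interpretation and not something stated in the thread. `subset_with_control` is a hypothetical helper written for this example.

```r
# Rebuild condition and time in the same row order as the sampleTable above.
sampleTable <- data.frame(
  condition = factor(c(rep(c("PH", "TCP"), each = 3), rep("CO", 3),
                       rep(c("PH", "TCP"), each = 3),
                       rep(c("PH", "TCP"), each = 3))),
  time = factor(c(rep(1, 6), rep(0, 3), rep(6, 6), rep(3, 6)))
)

# Hypothetical helper: keep one treatment plus the CO samples.
subset_with_control <- function(tab, treatment) {
  keep <- tab$condition %in% c(treatment, "CO")
  droplevels(tab[keep, ])
}

tcp_tab <- subset_with_control(sampleTable, "TCP")

# The interaction design is still rank deficient in the subset ...
mm_inter <- model.matrix(~ condition + time + condition:time, data = tcp_tab)
# ... whereas a time-only design has one coefficient per observed time point.
mm_time <- model.matrix(~ time, data = tcp_tab)

cat("interaction design: columns", ncol(mm_inter), "rank", qr(mm_inter)$rank, "\n")
cat("time-only design:   columns", ncol(mm_time), "rank", qr(mm_time)$rank, "\n")

# With DESeq2 loaded and the matching columns of htseq_data (not run here):
# dds <- DESeqDataSetFromMatrix(htseq_subset, tcp_tab, design = ~ time)
# dds <- DESeq(dds, test = "LRT", reduced = ~ 1)
```

Here `htseq_subset` stands for the count matrix restricted to the same samples, in the same order, as `tcp_tab`. With `reduced = ~ 1`, the likelihood ratio test asks whether expression changes at any time point relative to the time 0 control samples.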


Do you mean using the full design and running the likelihood ratio test (LRT)?

@mikelove
Last seen 2 hours ago
United States

For guidance on setting up your statistical analysis, I'd recommend working with a statistician or someone familiar with linear models. Unfortunately, I have to limit myself to software-related posts on the support site.