Hi, I'm new to bioinformatics so sorry if the question is a bit naive.
I've been trying to understand how to set up the design (model matrix) for DESeq2 with this experimental design:
condition | timepoint |
control | 2h |
control | 2h |
control | 2h |
control | 2h |
control | 2h |
control | 2h |
control | 6h |
control | 6h |
control | 6h |
control | 6h |
control | 6h |
untreated | 0h |
untreated | 0h |
untreated | 0h |
untreated | 0h |
untreated | 0h |
untreated | 0h |
wnt3a | 2h |
wnt3a | 2h |
wnt3a | 2h |
wnt3a | 2h |
wnt3a | 2h |
wnt3a | 2h |
wnt3a | 6h |
wnt3a | 6h |
wnt3a | 6h |
wnt3a | 6h |
wnt3a | 6h |
wnt3a | 6h |
Basically, untreated is different from control because the control samples are still treated with media, just without wnt3a.
I'm having difficulty representing this data in a correct design formula for DESeq2 because the untreated condition is linearly dependent on timepoint 0h. I can use ~timepoint + condition if I relabel the untreated samples as control, but I think the correct formula for this experiment would be something like ~timepoint + condition + timepoint:condition, which, as I understand it, means that the effect of the timepoint depends on the condition. The problem is that the matrix is then not full rank again. Splitting the controls between the two conditions doesn't really seem right to me either, as I would have 3 replicates each instead of 6.
Any suggestions?
thanks
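For anyone following along, the rank problem described above can be checked by hand. The following is a minimal sketch in plain Python, not DESeq2 itself: it builds treatment-coded design matrices for the five (condition, timepoint) cells in the table and computes their rank. The helper names `dummies` and `matrix_rank` and the choice of reference levels are assumptions made for illustration; the rank result does not depend on which reference levels are chosen.

```python
from fractions import Fraction

# One row per distinct (condition, timepoint) cell in the sample table.
# Replicates only repeat rows, so they cannot change the rank.
cells = [
    ("control", "2h"),
    ("control", "6h"),
    ("untreated", "0h"),
    ("wnt3a", "2h"),
    ("wnt3a", "6h"),
]


def dummies(values, reference):
    """Treatment-coded 0/1 columns, one per non-reference level."""
    levels = sorted(set(values) - {reference})
    return [[1 if v == level else 0 for level in levels] for v in values]


def matrix_rank(rows):
    """Matrix rank via exact Gaussian elimination over fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


conditions = [c for c, _ in cells]
timepoints = [t for _, t in cells]
groups = [f"{c}_{t}" for c, t in cells]  # combined condition + timepoint label

time_cols = dummies(timepoints, "0h")       # assumed reference level: 0h
cond_cols = dummies(conditions, "control")  # assumed reference level: control

designs = {
    # intercept + timepoint + condition
    "~timepoint + condition": [[1] + t + c for t, c in zip(time_cols, cond_cols)],
    # the same columns plus all timepoint:condition products
    "~timepoint + condition + timepoint:condition": [
        [1] + t + c + [a * b for a in t for b in c]
        for t, c in zip(time_cols, cond_cols)
    ],
    # intercept + one indicator per non-reference group
    "~group": [[1] + g for g in dummies(groups, "untreated_0h")],
}

ranks = {name: matrix_rank(design) for name, design in designs.items()}
for name, design in designs.items():
    print(f"{name}: {len(design[0])} columns, rank {ranks[name]}")
```

Under these assumptions the additive design has 5 columns but rank 4, and the interaction design has 9 columns but rank 5, so neither is full rank. A single combined factor with one level per cell gives 5 columns and rank 5.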
This is just some test data I downloaded from a public repository, but my idea would be to see, for example, which genes get activated during wnt3a activation over time, i.e. genes differentially expressed from 0h to 2h and from 0h to 6h, compared to control. I also wanted to see whether I could apply the time course part of the RNA-seq workflow tutorial, i.e. time points 0, 2 and 6, with the two conditions kept separate.
Thanks again
Here's a tricky thing, which catches lots of folks: let's say we have group = control, A, B, and C. What does it mean to say B vs A compared to control? Usually people want (B-control) - (A-control). This ends up being equivalent to B - A. It would be different if you had (B-control_B) - (A-control_A), and we have different ways of doing that kind of design. But with a single control, the control drops out of the equation. So you really just need to add a new variable 'group' that combines the condition and time point above, and then use ~group as the design.
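The cancellation above can be seen with a small numeric sketch in plain Python (not DESeq2). The group labels follow the condition_timepoint pattern, which is just one possible naming convention, and the expression values are made-up numbers for a single hypothetical gene, chosen only to illustrate the arithmetic.

```python
# Combine condition and timepoint into one 'group' label per sample.
# Only three example samples are listed here, not the full table.
samples = [("untreated", "0h"), ("control", "2h"), ("wnt3a", "2h")]
group = [f"{condition}_{timepoint}" for condition, timepoint in samples]

# Made-up mean log2 expression of one hypothetical gene in each group.
mean_log2 = dict(zip(group, [4.0, 4.5, 7.0]))

baseline = mean_log2["untreated_0h"]
wnt3a_change = mean_log2["wnt3a_2h"] - baseline      # 0h -> 2h with wnt3a
control_change = mean_log2["control_2h"] - baseline  # 0h -> 2h with media only

# (B - control) - (A - control): the shared baseline cancels ...
relative_to_baseline = wnt3a_change - control_change
# ... leaving the direct comparison B - A.
direct = mean_log2["wnt3a_2h"] - mean_log2["control_2h"]

print(relative_to_baseline, direct)  # prints: 2.5 2.5
```

With a ~group design in DESeq2, the corresponding comparison would presumably be requested as a contrast between two levels of group, for example `results(dds, contrast = c("group", "wnt3a_2h", "control_2h"))`, assuming the levels are named as in this sketch.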
OK, so if I understood correctly, I should just combine the two columns into a single variable and then run DESeq with just that one variable, right?
Yes, that's easiest here.
Perfect, Thanks!