I have an RNA-Seq time series with two conditions (SD and LD) and three time points (0, 2, and 4). There are three biological replicates per condition and time point. I have used DESeq2 to run a likelihood ratio test on the following designs:
    full    ~ condition + time + condition:time
    reduced ~ condition + time
I understand that this tests for the effects of the two conditions on the pattern of expression, which is what I want. Here are the names of my coefficients:
    > resultsNames(dds_test)
    [1] "Intercept"          "condition_LD_vs_SD" "time_2_vs_0"        "time_4_vs_0"        "conditionLD.time2"
    [6] "conditionLD.time4"
I would also like to contrast the difference between the conditions at the three time points and between time points within each condition. I was hoping someone could confirm my interpretations and answer a question about comparing time points which are not the baseline.
So am I correct in thinking I can test the difference between the conditions at the three timepoints like so:
    results(dds_test, name="condition_LD_vs_SD", test="Wald")
    results(dds_test, contrast=list(c("condition_LD_vs_SD","conditionLD.time2")), test="Wald")
    results(dds_test, contrast=list(c("condition_LD_vs_SD","conditionLD.time4")), test="Wald")
and between time points within each condition like so:
- Condition SD
    results(dds_test, contrast=list(c("time_2_vs_0")), test="Wald")
    results(dds_test, contrast=list(c("time_4_vs_0")), test="Wald")
- Condition LD
    results(dds_test, contrast=list(c("time_2_vs_0", "conditionLD.time2")), test="Wald")
    results(dds_test, contrast=list(c("time_4_vs_0", "conditionLD.time4")), test="Wald")
Is this correct and is there an easy way to calculate and test the fold changes between time points 2 and 4?
This is my sample table:
                sample    condition  time  replicate
    LD_ZT4_1    LD_ZT4_1  LD         4     1
    LD_ZT4_2    LD_ZT4_2  LD         4     2
    LD_ZT4_3    LD_ZT4_3  LD         4     3
    SD_ZT4_1    SD_ZT4_1  SD         4     1
    SD_ZT4_2    SD_ZT4_2  SD         4     2
    SD_ZT4_3    SD_ZT4_3  SD         4     3
    SD_ZT0_1    SD_ZT0_1  SD         0     1
    SD_ZT0_2    SD_ZT0_2  SD         0     2
    SD_ZT0_3    SD_ZT0_3  SD         0     3
    SD_ZT2_1    SD_ZT2_1  SD         2     1
    SD_ZT2_2    SD_ZT2_2  SD         2     2
    SD_ZT2_3    SD_ZT2_3  SD         2     3
    LD_ZT0_1    LD_ZT0_1  LD         0     1
    LD_ZT0_2    LD_ZT0_2  LD         0     2
    LD_ZT0_3    LD_ZT0_3  LD         0     3
    LD_ZT2_1    LD_ZT2_1  LD         2     1
    LD_ZT2_2    LD_ZT2_2  LD         2     2
    LD_ZT2_3    LD_ZT2_3  LD         2     3
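One way to sanity-check the contrasts proposed above, without fitting anything, is to look at the design matrix directly. The sketch below uses only base R. It rebuilds a sample table with the same layout as the one in the question, renames the `model.matrix` columns to match the `resultsNames()` output, and lists which coefficients remain when one condition/time cell is subtracted from another. The helpers `cell` and `active_terms` are hypothetical names introduced here for illustration, not part of DESeq2.

```r
# Sample table with the same layout as in the question:
# 2 conditions x 3 time points x 3 replicates, SD and time 0 as reference levels
coldata <- expand.grid(replicate = 1:3,
                       time = factor(c(0, 2, 4)),
                       condition = factor(c("SD", "LD"), levels = c("SD", "LD")))

# Design matrix of the full model, with columns renamed to the
# coefficient names reported by resultsNames(dds_test)
mm <- model.matrix(~ condition + time + condition:time, data = coldata)
colnames(mm) <- c("Intercept", "condition_LD_vs_SD", "time_2_vs_0",
                  "time_4_vs_0", "conditionLD.time2", "conditionLD.time4")

# Coefficient weights that make up the fitted value of one condition/time cell
cell <- function(cond, tp) {
  mm[which(coldata$condition == cond & coldata$time == tp)[1], ]
}

# Names of the coefficients with non-zero weight in a difference of two cells
active_terms <- function(v) names(v)[v != 0]

# LD vs SD at time 0 and at time 2
active_terms(cell("LD", "0") - cell("SD", "0"))
active_terms(cell("LD", "2") - cell("SD", "2"))

# Time 2 vs 0 within SD, and time 4 vs 0 within LD
active_terms(cell("SD", "2") - cell("SD", "0"))
active_terms(cell("LD", "4") - cell("LD", "0"))
```

If the coefficients listed for a comparison match the names passed to `contrast=list(c(...))`, the contrast targets the intended difference under this parameterization.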
Note, I updated the answer above.
How can I find the difference at time 4 vs 2 in each condition?
There won’t be such a coefficient. It is possible, but a bit difficult, with the interaction design. It is better to combine the condition and time variables into a single variable called “group”; see the Interaction section of the vignette.
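As a rough sketch of both routes for time 4 vs 2: the first builds the combined `group` factor and uses the three-element `contrast` form of `results()`; the second stays with the interaction design and passes a numerator/denominator list of coefficient names. The function names (`group_4_vs_2`, `terms_4_vs_2`, `interaction_4_vs_2`) are hypothetical, and the objects `dds_group` (design `~ group`, already run through `DESeq()`) and `dds_test` (the interaction design from the question) are assumed to exist in your session. The wrappers are only defined here, not called.

```r
# Sample table with the same layout as in the question
coldata <- expand.grid(replicate = 1:3,
                       time = factor(c(0, 2, 4)),
                       condition = factor(c("SD", "LD"), levels = c("SD", "LD")))

# Option 1: one combined factor with levels such as "LD2" and "SD4"
coldata$group <- factor(paste0(coldata$condition, coldata$time))

# Hypothetical wrapper; assumes `dds_group` has design ~ group and has
# already been run through DESeq()
group_4_vs_2 <- function(dds_group, cond) {
  DESeq2::results(dds_group,
                  contrast = c("group", paste0(cond, "4"), paste0(cond, "2")))
}

# Option 2: keep the interaction design and build the list of
# numerator and denominator coefficient names for time 4 vs 2
terms_4_vs_2 <- function(cond) {
  num <- "time_4_vs_0"
  den <- "time_2_vs_0"
  if (cond == "LD") {
    # In LD the interaction terms contribute to each time effect
    num <- c(num, "conditionLD.time4")
    den <- c(den, "conditionLD.time2")
  }
  list(num, den)
}

# Hypothetical wrapper; assumes `dds_test` is the interaction-design object
interaction_4_vs_2 <- function(dds_test, cond) {
  DESeq2::results(dds_test, contrast = terms_4_vs_2(cond), test = "Wald")
}
```

With the `group` design every pairwise comparison is a plain level-vs-level contrast, which is why it tends to be easier to read than sums and differences of interaction coefficients.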
Hello Michael! I used both methods to get the DEG list, namely the "group" design and the interaction design, but the results differ: the number of DEGs from the interaction design is larger. Is this normal? Many thanks for your help.
Can you plot the LFC? Are they comparable? If they are not highly correlated, I would guess you are not comparing similar things.
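A minimal sketch of such a comparison in base R, assuming the two results tables have gene identifiers as row names and a `log2FoldChange` column, as DESeq2 results tables do. The function `compare_lfc` and the toy data frames are made up for illustration; with real data you would pass the two results objects instead.

```r
# Hypothetical helper: scatter plot and Pearson correlation of the
# log2 fold changes from two results tables, matched by gene ID
compare_lfc <- function(res_a, res_b, do_plot = TRUE) {
  a <- as.data.frame(res_a)
  b <- as.data.frame(res_b)
  genes <- intersect(rownames(a), rownames(b))
  x <- a[genes, "log2FoldChange"]
  y <- b[genes, "log2FoldChange"]
  ok <- is.finite(x) & is.finite(y)  # drop NA or infinite estimates
  if (do_plot) {
    plot(x[ok], y[ok], xlab = "LFC, design A", ylab = "LFC, design B")
    abline(0, 1, lty = 2)  # identity line for reference
  }
  cor(x[ok], y[ok])
}

# Toy tables standing in for two real results objects
lfc <- seq(-2, 2, length.out = 20)
toy_a <- data.frame(log2FoldChange = lfc,
                    row.names = paste0("gene", 1:20))
toy_b <- data.frame(log2FoldChange = lfc + rep(c(0.1, -0.1), 10),
                    row.names = paste0("gene", 1:20))

r <- compare_lfc(toy_a, toy_b, do_plot = FALSE)
```

Points far from the identity line, or a low correlation, suggest the two contrasts are not estimating the same quantity.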
Yes, you are correct. I was not comparing the same thing. Sorry for my negligence. I have another question. Attached is my metadata. The factors I am most interested in comparing are treatment and day, but genotype and donor also seem to contribute to the variation. Can I get the right answer by combining treatment and day into "group" and ignoring genotype and donor?
PS. It seems that the additional genotype and donor terms in the interaction design are responsible for the larger number of DEGs.
Thanks again.
For choosing an appropriate design, I'd recommend collaborating with a statistician. Unfortunately, I only have time here for software questions.
Sorry. I'm at a medical school and don't know a statistician. Can you point me to some materials?
My role here is to provide software support for my Bioconductor packages.
I have many other commitments, so I have to protect my time, while still handling the large number of support requests I receive every week.
If you are in a medical school, there is likely someone at your institution you could consult with.
Got it. Thanks for your advice~