I have a question regarding an RNA-seq data set.
The data consist of 2 disease groups (Disease 1, Disease 2), 2 treatments (Treatment 1, Treatment 2), and 2 time points (Timepoint 1, Timepoint 2), with 30 patients in total.
I want to make the following comparisons:
- Timepoint 2 vs Timepoint 1 for Patients on Treatment 1
- Timepoint 2 vs Timepoint 1 for Patients on Treatment 2
- Treatment 1 vs Treatment 2 irrespective of Disease at Timepoint 2
- Treatment 1 vs Treatment 2 for Patients with Disease 1 at Timepoint 2
- Treatment 1 vs Treatment 2 for Patients with Disease 2 at Timepoint 2
What would be the best way to design the model and do the comparisons?
Thank you - yes I have read both manuals which are very helpful. However, I couldn't find an example that did exactly what I wanted. I have done the following:
The comparisons you describe are not 100% clear, but as I interpret them, qlf1 and qlf2 answer your first two comparisons. The remaining three are all comparisons within Timepoint 2, where there is no longer any pairing, so you can just make the usual unpaired comparisons. Normally the best thing to do is to fit a cell means model and make the required comparisons using contrasts. Something like
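As a minimal sketch of the cell means idea (not the code from the original reply), the block below builds one design column per Disease/Treatment/Timepoint combination and writes the three Timepoint 2 comparisons as contrast vectors. The sample sheet `targets`, its level names, and the helper `make_contrast` are hypothetical and only there for illustration; the edgeR calls are shown as comments and assume a DGEList `y` with dispersions already estimated.

```r
# Hypothetical sample sheet: 8 patients, each with one disease and one
# treatment, sampled at both time points (16 libraries in total).
targets <- data.frame(
  Patient   = factor(rep(paste0("P", 1:8), each = 2)),
  Disease   = rep(c("D1", "D2"), each = 8),
  Treatment = rep(rep(c("Trt1", "Trt2"), each = 4), times = 2),
  Timepoint = rep(c("TP1", "TP2"), times = 8)
)

# Cell means model: one coefficient per Disease.Treatment.Timepoint group.
group <- factor(paste(targets$Disease, targets$Treatment,
                      targets$Timepoint, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Hypothetical helper: a contrast vector that is zero everywhere except
# for the named groups.
make_contrast <- function(weights) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[names(weights)] <- weights
  v
}

# Treatment 1 vs Treatment 2 at Timepoint 2, within each disease.
trt_d1 <- make_contrast(c(D1.Trt1.TP2 = 1, D1.Trt2.TP2 = -1))
trt_d2 <- make_contrast(c(D2.Trt1.TP2 = 1, D2.Trt2.TP2 = -1))

# Treatment 1 vs Treatment 2 at Timepoint 2, averaged over both diseases.
trt_any <- (trt_d1 + trt_d2) / 2

# With edgeR (assumed, not run here):
# fit <- glmQLFit(y, design)
# qlf <- glmQLFTest(fit, contrast = trt_any)
```

Averaging the two within-disease contrasts is one reading of "irrespective of Disease"; other definitions are possible.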
Your qlf3 comparison is testing for any genes that change expression between time 1 and time 2 for either treatment, and your qlf4 is doing the interaction term, where you are asking for genes that react differently between timepoints depending on the treatment.
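To make the difference between those two tests concrete, here is a sketch of a paired design of the kind the coefficient names in this thread suggest. The poster's actual code is not shown, so the layout, the sample sheet `targets`, and the sign of the interaction contrast are assumptions. Each patient gets their own baseline column, and two extra columns carry the Timepoint 2 vs Timepoint 1 effect within each treatment.

```r
# Hypothetical sample sheet: 8 patients, 4 per treatment, 2 time points each.
targets <- data.frame(
  Patient   = factor(rep(paste0("P", 1:8), each = 2)),
  Treatment = rep(rep(c("Treatment1", "Treatment2"), each = 4), times = 2),
  Timepoint = rep(c("Timepoint1", "Timepoint2"), times = 8)
)

# Patient columns absorb baseline differences between patients.
design <- model.matrix(~ Patient, data = targets)

# Treatment-specific time effects (Timepoint 2 vs Timepoint 1).
t1 <- as.numeric(targets$Treatment == "Treatment1" &
                 targets$Timepoint == "Timepoint2")
t2 <- as.numeric(targets$Treatment == "Treatment2" &
                 targets$Timepoint == "Timepoint2")
design <- cbind(design, Treatment1.Timepoint2 = t1, Treatment2.Timepoint2 = t2)

k <- ncol(design)  # the two time-effect columns are the last two

# A qlf3-style test: both coefficients at once, a 2-df F-test for a time
# effect in either treatment.
coef_either <- c(k - 1, k)

# A qlf4-style test: the difference between the two time effects, i.e. the
# treatment-by-time interaction.
con_interaction <- numeric(k)
con_interaction[c(k - 1, k)] <- c(-1, 1)

# With edgeR (assumed, not run here; `y` is a DGEList with dispersions):
# fit  <- glmQLFit(y, design)
# qlf3 <- glmQLFTest(fit, coef = coef_either)
# qlf4 <- glmQLFTest(fit, contrast = con_interaction)
```

The first test flags a gene if either time effect is non-zero; the second flags it only if the two time effects differ from each other.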
Yes, you got the comparisons I am trying to make! Thanks for the explanation. I was just wondering how do you adjust for baseline differences between the patients in your model? Should you not consider Donors for the model?
You could. As I noted, qlf3 is already doing an F-test for any difference between time 1 and 2 for either treatment. In other words, Treatment1.Timepoint2 and Treatment2.Timepoint2 in your model are the difference between time 2 and time 1, within each of the two treatment arms. And qlf3 is doing an F-test that tests for either (or both) of those coefficients being different from zero. You could also do
Which would test that the average of the difference between time 1 and 2 for the two treatments is different from zero, which would adjust for the baseline donor effect. But that is not the same as what you asked for earlier.
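A small numeric sketch of why the averaged test differs from the F-test on both coefficients. The coefficient values are made up for a single hypothetical gene, and the vector covers only the two time-effect coefficients; in a real design it would be padded with zeros for the patient columns.

```r
# Hypothetical log-fold-changes (Timepoint 2 vs Timepoint 1) for one gene,
# one per treatment arm.
beta <- c(Treatment1.Timepoint2 = 1.2, Treatment2.Timepoint2 = -1.2)

# Averaging contrast: half of each time effect.
avg_contrast <- c(0.5, 0.5)
avg_effect <- sum(avg_contrast * beta)

# The gene changes over time in both arms, but in opposite directions,
# so the average time effect is zero.
both_nonzero <- all(beta != 0)
```

For this gene the averaged contrast has nothing to detect, while a test on the two coefficients jointly would be looking at two non-zero effects.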