Hello,
I am currently using DESeq2 to run some tests on a case/control RNA-seq dataset with 2 time-point measurements (i.e. each sample has 2 measurements and is classified as either case or control). Cases and controls are not matched/paired.
I have read the tutorial and any other relevant information I could find, but I can’t quite understand what the role of PatientID in the design formula is. Most importantly, I can’t decide whether I need to include it in my design when I am looking for differentially expressed genes between cases and controls, or between the 2 time points. Is the PatientID term essential in the formula as a flag that the same sample is present more than once (at different time points), or does it correct for inter-individual variability in the model? I would really appreciate your help on this.
My question is more general, but I’m also including a snapshot of the colData(dds) for your convenience:
[colData(dds) snapshot not preserved]
Thank you, Olga
Thank you James for your reply. The t-test example helped me a lot in getting a better idea of the effect of including PatientID in the model. So it's now clear how I should compare the expression of each group (cases/controls) between the two visits.
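For readers following along, the t-test analogy can be sketched in base R. This is only an illustration on simulated numbers for a single gene (the patient count, effect size, and noise levels are all made up), not DESeq2 itself: it compares visits while ignoring the patient, then with each patient as their own baseline, and shows that the paired test matches a linear model with a PatientID term.

```r
# Simulated log-expression of one gene in 6 patients at two visits.
# All values are invented for illustration only.
set.seed(1)
n_patients <- 6
baseline <- rnorm(n_patients, mean = 8, sd = 2)         # large patient-to-patient spread
visit1 <- baseline + rnorm(n_patients, sd = 0.1)
visit2 <- baseline + 0.5 + rnorm(n_patients, sd = 0.1)  # true visit effect of 0.5

# Ignoring PatientID: the visit effect is judged against between-patient spread.
unpaired <- t.test(visit2, visit1, var.equal = TRUE)

# Using PatientID: each patient acts as their own baseline.
paired <- t.test(visit2, visit1, paired = TRUE)

# The same paired comparison written as a design formula with a PatientID term.
long <- data.frame(
  expr      = c(visit1, visit2),
  PatientID = factor(rep(seq_len(n_patients), times = 2)),
  Visit     = factor(rep(c("V1", "V2"), each = n_patients))
)
fit <- lm(expr ~ PatientID + Visit, data = long)
lm_p <- summary(fit)$coefficients["VisitV2", "Pr(>|t|)"]

print(c(unpaired = unpaired$p.value, paired = paired$p.value, lm = lm_p))
```

In this sketch the PatientID term does both things asked about in the original question: it marks which samples come from the same person, and in doing so it removes the between-patient variability from the visit comparison.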
However, I still have some doubts about the model I should use to look for differential expression between cases and controls while controlling for differences in expression between the two visits, i.e. design(dds) <- formula(~ Visit + Phenotype). In this case the comparison is between "unrelated" groups, even though colData(dds) includes samples from the same patient. So in this example, would you include PatientID in the model or not?
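One practical point about this question can be checked with base R alone. The sketch below uses a hypothetical 4-patient colData (the patient labels and group sizes are assumptions, not the real data) and a small helper, is_full_rank, written just for this example. Because every patient is either a case or a control, Phenotype is a linear combination of the PatientID columns, so simply adding PatientID to ~ Visit + Phenotype gives a model matrix that is not full rank, which DESeq2 would be expected to reject.

```r
# Hypothetical sample table: 4 patients, 2 visits each, 2 controls and 2 cases.
coldata <- data.frame(
  PatientID = factor(rep(c("P1", "P2", "P3", "P4"), each = 2)),
  Visit     = factor(rep(c("V1", "V2"), times = 4)),
  Phenotype = factor(rep(c("control", "case"), each = 4),
                     levels = c("control", "case"))
)

# Helper for this example: TRUE if every coefficient in the design is estimable.
is_full_rank <- function(formula, data) {
  mm <- model.matrix(formula, data)
  qr(mm)$rank == ncol(mm)
}

# Design from the question: estimable, but treats a patient's two samples as independent.
full_without_patient <- is_full_rank(~ Visit + Phenotype, coldata)

# Adding PatientID directly: Phenotype is confounded with the PatientID columns.
full_with_patient <- is_full_rank(~ PatientID + Visit + Phenotype, coldata)

print(c(without_patient = full_without_patient, with_patient = full_with_patient))
```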
Thank you again. Your help is much appreciated.
So you have cases and controls, and each patient had two visits, right? In that situation you usually want to know about the interaction (e.g., you want to know if the differences between visit 1 and visit 2 are the same for cases and controls or not). If this is correct, then there is quite a bit of exposition in the DESeq2 vignette that covers the situation.
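As a rough sketch of what that part of the vignette describes (group-specific effects with individuals nested within groups), the design can be built and inspected with base R's model.matrix. The sample table below is hypothetical, and ind.n is a helper column that renumbers patients within each Phenotype group, following the vignette's naming. The DESeq2 calls in the trailing comments are not run here, and the coefficient names in them are assumptions that should be checked against resultsNames(dds).

```r
# Hypothetical sample table: 4 patients, 2 visits each, 2 controls and 2 cases.
coldata <- data.frame(
  PatientID = factor(rep(c("P1", "P2", "P3", "P4"), each = 2)),
  Visit     = factor(rep(c("V1", "V2"), times = 4)),
  Phenotype = factor(rep(c("control", "case"), each = 4),
                     levels = c("control", "case"))
)

# Renumber patients within each Phenotype group (1, 2, ... per group).
coldata$ind.n <- factor(ave(
  as.integer(coldata$PatientID), coldata$Phenotype,
  FUN = function(x) as.integer(factor(x))
))

# Patient effects nested within Phenotype, plus a separate Visit effect per group.
design_formula <- ~ Phenotype + Phenotype:ind.n + Phenotype:Visit
mm <- model.matrix(design_formula, coldata)

design_is_full_rank <- qr(mm)$rank == ncol(mm)
visit_columns <- grep("VisitV2", colnames(mm), value = TRUE)

print(design_is_full_rank)
print(visit_columns)  # one Visit effect for controls, one for cases

# With DESeq2 (not run; coefficient names are assumptions, check resultsNames(dds)):
#   design(dds) <- design_formula
#   dds <- DESeq(dds)
#   results(dds, contrast = list("Phenotypecase.VisitV2", "Phenotypecontrol.VisitV2"))
```

The contrast in the last comment would ask the interaction question raised above: whether the change from visit 1 to visit 2 differs between cases and controls, with each patient's baseline accounted for.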
Thank you again, I really appreciate the help. I have gone through the vignette, but I was not sure which model was the most appropriate for the analysis I had in mind. I'll go through the DESeq2 vignette in a more targeted way now. Thank you again.