This is my first post on Bioconductor so if I am not clear please have some patience. I am working on an RNA-seq (human) dataset with 6 paired samples pre and post treatment and 2 independent controls. Unfortunately I don't have a bioinformatics background so I am trying to grasp the design matrix concept. I can do simple design differential expression analysis with EdgeR, but in this case I have a situation with paired samples, 2 independent samples and spike-in controls that I would like to use with RUVseq to remove variability. This means in my design matrix I have to account for the W_1 as a covariate.
How to design a matrix for this purpose? If it was only the paired samples it would be an additive model as described on the edger vignette and I would do something like:
design = model.matrix(~patient+W_1+Treat, data = pData(set1))
with W_1 from RUVseq with spike-in controls.
But in this case I also have 2 healthy controls that are completely independent from the patients pre and post treatment. So I would like to design a matrix so that I can compare differential expression within the patients pre and post treatment, between patients pre treatment and healthy controls and between patients post-treatment and healthy controls. So in the end my question is how to design a matrix and after fitting the model how to get the differential expression for each case.
If you also have any good examples to understand the design matrix and how to get the differential expression with glmLRT(fit) in edger that would be greatly appreciated.
Thanks for your kind help,