I have an experiment with 45 samples. They cover three different surgeries, two treatments, and two time periods, for a total of 10 experimental conditions with 4-5 replicates each.
I'm using DESeq2 to test for differential expression (DE) of genes from RNA count data.
There are several comparisons we are interested in:
- Surgeries versus Sham at two time points
- Surgery 1 over sham vs. Surgery 2 over sham
- Treatment vs control (treatment is different from surgery)
- Surgery differences in comparison to treatment vs. control
My question is about best practice. I originally ran DESeq2 with a single factor, treatment, which is a concatenation of the two treatments, three surgeries, and two time periods (10 groups of 4-5 replicates each, because time period 2 is missing one surgery), using design ~ treatment.
Then I have been using contrast statements to examine individual comparisons such as the following:
res.d1.ozone <- results(dds, contrast = c("treatment", "Day1OzoneSHAM", "Day1AirSHAM"), parallel = TRUE)
This compares Ozone to Air for sham surgery at time period one, and so on for the other comparisons.
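To make the setup concrete, here is a minimal sketch on simulated data. The sample sheet, the `Surg1`/`Surg2` level names, the `Day2` labels, and the use of `makeExampleDESeqDataSet()` for placeholder counts are all assumptions for illustration; only `Day1OzoneSHAM` and `Day1AirSHAM` are names from the real analysis.

```r
# Toy sample sheet: 3 surgeries x 2 exposures x 2 days, 4 replicates each.
# Surg1/Surg2 and the Day2 labels are hypothetical placeholders.
coldata <- expand.grid(
  replicate = 1:4,
  surgery   = c("SHAM", "Surg1", "Surg2"),
  exposure  = c("Air", "Ozone"),
  day       = c("Day1", "Day2"),
  stringsAsFactors = FALSE
)
# One surgery is missing at the second time point, leaving 10 groups.
coldata <- coldata[!(coldata$day == "Day2" & coldata$surgery == "Surg2"), ]

# The single concatenated factor used in design ~ treatment.
coldata$treatment <- factor(paste0(coldata$day, coldata$exposure, coldata$surgery))
n_groups <- nlevels(coldata$treatment)

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  # Simulated counts stand in for the real count matrix.
  dds <- makeExampleDESeqDataSet(n = 500, m = nrow(coldata))
  dds$treatment <- coldata$treatment
  design(dds) <- ~ treatment
  dds <- DESeq(dds)
  # Day 1, sham surgery: Ozone versus Air.
  res.d1.ozone <- results(dds, contrast = c("treatment", "Day1OzoneSHAM", "Day1AirSHAM"))
}
```

In this layout every pairwise comparison is a `contrast` between two levels of the one combined factor, and all samples contribute to the normalization and dispersion estimates.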
My question: Should I be running a separate dds (and normalization etc...) followed by:
dds <- DESeq(dds)
res <- results(dds)
on subsets of the data (in this case 4 replicates of each group), or am I better off normalizing all the data in a single dds, as I have done with all 45 samples, and using contrast statements as above to evaluate each comparison?
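For comparison, the subset alternative might look like the sketch below. The `Surg1` group labels and the simulated counts from `makeExampleDESeqDataSet()` are placeholders, not the real data; the point is only to show what "a separate dds on a subset" means in code.

```r
# Hypothetical group labels; only the two Day1 SHAM names come from the real data.
groups <- c("Day1AirSHAM", "Day1OzoneSHAM", "Day1AirSurg1", "Day1OzoneSurg1")
treatment <- factor(rep(groups, each = 4))

# Keep only the samples involved in one comparison.
keep <- treatment %in% c("Day1OzoneSHAM", "Day1AirSHAM")
sub_treatment <- droplevels(treatment[keep])

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  # Simulated counts stand in for the real count matrix.
  dds <- makeExampleDESeqDataSet(n = 500, m = length(treatment))
  dds$treatment <- treatment
  design(dds) <- ~ treatment
  # Subset the columns and drop the now-unused factor levels.
  dds_sub <- dds[, keep]
  dds_sub$treatment <- droplevels(dds_sub$treatment)
  # DESeq() now estimates dispersions from these 8 samples only.
  dds_sub <- DESeq(dds_sub)
  res_sub <- results(dds_sub, contrast = c("treatment", "Day1OzoneSHAM", "Day1AirSHAM"))
}
```

The practical difference from the single-dds approach is that the subset fit sees far fewer samples when estimating dispersions.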
Also, the outcome at the second time point is very different from the first, so I won't include time as a factor in a multi-factor design. Instead, I will use a multi-factor design within each time period to examine the treatment and surgery factors at the same time.
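A minimal sketch of such a within-time-period design, again on simulated data. The column name `exposure` (used here for the Ozone/Air factor to avoid clashing with the concatenated `treatment` column), the `Surg1`/`Surg2` levels, and the placeholder counts are all assumptions.

```r
# Toy sample sheet for one time period: 3 surgeries x 2 exposures x 4 replicates.
day1 <- expand.grid(
  replicate = 1:4,
  surgery   = c("SHAM", "Surg1", "Surg2"),
  exposure  = c("Air", "Ozone")
)
# Make SHAM and Air the reference levels.
day1$surgery  <- relevel(factor(day1$surgery), ref = "SHAM")
day1$exposure <- relevel(factor(day1$exposure), ref = "Air")

# Columns of the additive design matrix: intercept, two surgery effects, one exposure effect.
mm <- model.matrix(~ surgery + exposure, data = day1)
print(colnames(mm))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  # Simulated counts stand in for the real Day 1 count matrix.
  dds1 <- makeExampleDESeqDataSet(n = 500, m = nrow(day1))
  dds1$surgery  <- day1$surgery
  dds1$exposure <- day1$exposure
  design(dds1) <- ~ surgery + exposure
  dds1 <- DESeq(dds1)
  # Ozone versus Air, adjusted for surgery.
  res_exposure <- results(dds1, contrast = c("exposure", "Ozone", "Air"))
}
```

This additive design assumes the exposure effect is the same across surgeries; asking whether it differs by surgery would need an interaction term such as `~ surgery + exposure + surgery:exposure`.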