I know that a number of questions have already been asked on this topic, but they seem to ask what happens in the case of 5 samples vs 3 samples, or even 150 vs 3. My question concerns an experimental design comparing two experiments, each of which has timepoints unique to it. That is, in experiment 1 I have a 12-hr timepoint that does not exist in experiment 2, and in experiment 2 I have a 3-day timepoint that does not exist in experiment 1. Apart from these, I have 3 overlapping timepoints: baseline, d1, and d2.
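To make the layout concrete, here is a minimal sketch of the design described above. The experiment and timepoint labels (`exp1`, `12h`, `d3`, and so on) are just my shorthand, and replicate counts are left out on purpose.

```python
# Timepoints sampled in each experiment.
# Labels are illustrative shorthand, not names from any real sample sheet.
timepoints = {
    "exp1": ["baseline", "12h", "d1", "d2"],
    "exp2": ["baseline", "d1", "d2", "d3"],
}

exp1 = set(timepoints["exp1"])
exp2 = set(timepoints["exp2"])

shared = sorted(exp1 & exp2)      # timepoints present in both experiments
only_exp1 = sorted(exp1 - exp2)   # timepoints unique to experiment 1
only_exp2 = sorted(exp2 - exp1)   # timepoints unique to experiment 2

print("shared:", shared)
print("only in exp1:", only_exp1)
print("only in exp2:", only_exp2)
```

The point of writing it out is that each unique timepoint occurs in exactly one experiment, so for those samples timepoint and experiment cannot be separated by the layout alone.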
My understanding of DESeq2 is that as long as the biology is maintained (that is, the samples were all treated the same way), the more information I feed into it, the more accurate a model it can fit. So I am inclined to give DESeq2 all of the timepoints, but limit the differential expression analysis to the overlapping timepoints.
My questions are:
- Is this valid? Will I get a more accurate model if I give DESeq2 all of the RNA-seq samples I have?
- Will I be able to perform differential expression analysis on the non-overlapping timepoints if they are normalized to a shared control timepoint (baseline/untreated)? Will integrating the 2 experiments still eliminate batch effects, even among samples from the non-overlapping timepoints?