Hey community!
As a pure behavioural ecologist who has stumbled into the world of gene expression analysis and am a novice in analyzing it, I am asking for help in validating whether my model is correct for the type of analysis I want to do. I have been reading the DESeq2 manual and trying to figure out the correct way to analyze my data.
My experiment is about finding differential gene expression in ants facing a pathogen infection, so I have samples from three timepoints (Time: 0, 1, 2) and a condition (Treatment: control, infected). I am using colony fragments from a shared original nests for both the conditions over all timepoints, meaning I should control for a batch effect for the model (original nest, in the model as Replicate).
I approached the analysis as follows:
dds <- DESeqDataSetFromMatrix(countData = tempdf,
colData = sfile,
design=~ Replicate + Treatment + Time + Treatment:Time)
dds <- DESeq(dds)
Using contrasts, I wanted to specifically focus on differential expression over timepoint 1 and timepoint 2, which I got from the interaction term Treatment:Time
resT1 <- results(dds, name="TreatmentInfected.Time1", test = "Wald")
resT2 <- results(dds, name="TreatmentInfected.Time2", test="Wald")
It was a bit unclear to me whether I should use these types of comparisons or whether to include a reduced model with an LRT test, or whether the interaction model would be the correct way to go.
The results of these models seem logical enough, quite in line with a quick and dirty look at the expression counts, and finding with quite good accuracy the same differentially expressed genes as an attempt with EdgeR (however I could not get the batch effect to work there for some reason).
I would be extremely greatful for any comments on whether my approach is correct or whether I should modify my model and take another approach!