I am a microbiology grad student new to bioinformatics, conducting some RNA-Seq experiments to determine differentially expressed genes after antimicrobial treatment.
I have an untreated control and 4 different antimicrobial treatments all conducted on the same cell type/microorganism with 3 biological replicates for each. The only factor changing between these groups is the treatment applied to them.
Initially I determined differentially expressed genes (DEGs) for each treatment compared to the untreated control separately, each with its own DESeqDataSet object. This made it difficult to compare DE between groups and to visualise the results as heat maps etc. After some reading I generated one DESeqDataSet object which included all treatments, and then used the contrast argument of results() to determine DEGs for each treatment compared to the untreated control.
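A minimal sketch of building such a combined object, assuming a gene-by-sample count matrix and a sample table with a single condition column. The counts here are simulated, and `counts`, `coldata`, the sample names and the treatment labels are placeholders rather than real data.

```r
library(DESeq2)

set.seed(1)

# Placeholder sample table: 5 groups x 3 biological replicates = 15 samples.
groups <- c("untreated", paste0("treatment", 1:4))
coldata <- data.frame(
  condition = factor(rep(groups, each = 3)),
  row.names = paste0("sample", 1:15)
)

# Simulated counts for 1000 genes, each gene with its own mean expression.
gene_means <- 2^runif(1000, min = 4, max = 10)
counts <- matrix(
  rnbinom(1000 * 15, mu = gene_means, size = 5),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), rownames(coldata))
)
storage.mode(counts) <- "integer"

# One object holding every sample; the design has a single factor.
dds <- DESeqDataSetFromMatrix(
  countData = counts,
  colData   = coldata,
  design    = ~ condition
)
```

Factor levels are sorted alphabetically by default, so in this sketch "treatment1" would be the reference level until it is changed.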
First I set the reference level:
dds$condition <- relevel(dds$condition, ref="untreated")
To determine differential expression:
dds <- DESeq(dds)
res_treatment1 <- results(dds, alpha=0.05, lfcThreshold = 1, altHypothesis="greaterAbs", contrast = c("condition", "treatment1", "untreated"))
The numbers of differentially expressed genes, outliers and low-count genes were quite different between these two approaches, despite using the same BAM alignment files and the same FDR and LFC thresholds.
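To make the comparison concrete, the two approaches can be sketched side by side as below. This is only an illustration on simulated counts: `dds_all`, `dds_pair` and the helper `get_res` are hypothetical names, and the data contain no real treatment effect. One difference that is certain is that the combined fit estimates dispersions from all 15 samples, whereas the pairwise fit only sees the 6 samples in the subset.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the real experiment (placeholder names throughout).
groups <- c("untreated", paste0("treatment", 1:4))
coldata <- data.frame(
  condition = factor(rep(groups, each = 3)),
  row.names = paste0("sample", 1:15)
)
gene_means <- 2^runif(1000, min = 4, max = 10)
counts <- matrix(
  rnbinom(1000 * 15, mu = gene_means, size = 5),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), rownames(coldata))
)
storage.mode(counts) <- "integer"

dds_all <- DESeqDataSetFromMatrix(counts, coldata, design = ~ condition)
dds_all$condition <- relevel(dds_all$condition, ref = "untreated")

# Hypothetical helper: fit the model and test treatment1 vs untreated
# with the same thresholds in both approaches.
get_res <- function(dds) {
  dds <- DESeq(dds, quiet = TRUE)
  results(dds, alpha = 0.05, lfcThreshold = 1, altHypothesis = "greaterAbs",
          contrast = c("condition", "treatment1", "untreated"))
}

# Approach 1: all 15 samples in one object, contrast extracted afterwards.
res_combined <- get_res(dds_all)

# Approach 2: only the 6 samples of this comparison, unused levels dropped.
dds_pair <- dds_all[, dds_all$condition %in% c("untreated", "treatment1")]
dds_pair$condition <- droplevels(dds_pair$condition)
res_pairwise <- get_res(dds_pair)

# summary() reports the DE gene, outlier and low-count tallies for each fit.
summary(res_combined)
summary(res_pairwise)
```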
Despite reading the DESeq2 manual, I am still unsure which approach is more appropriate. Any advice is most welcome. Thank you!