I have RNAseq data for three conditions (A, B, and C) with results from three different batches (PK, GB, and EG) and I am trying to apply batch correction using DeSeq2. I am intersted in contrasting all three condtions (i.e. A vs B, B vs C and A vs C) while taking the batch effect into account. My code is as follow:
# Assign condition and batch
(condition <- factor(c(rep("A", 3), rep("B", 3), rep("C", 3))))
(batch <- factor (c("PK","GB", "EG","PK","GB", "EG","PK","GB", "EG")))
# Analysis with DESeq2 ----------------------------------------------------
library(DESeq2)
# Create a coldata frame and instantiate the DESeqDataSet
(coldata <- data.frame(row.names=colnames(countdata), condition, batch))
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~batch + condition)
# Run the DESeq pipeline
dds <- DESeq(dds)
When I run this code and then do resutsNames(dds), I get the following:
[1] "Intercept" "batch_GB_vs_EG" "batch_PK_vs_EG" "condition_B_vs_A" "condition_C_vs_A"
However, I am also interested in "conditionAvsC" and I am not sure why this is not being created by this design formula. Also, I am not sure why "batchPKvsGB" is not being created. Is there something wrong with my design formula?
Thanks for your help!
Thank you! Just another quick follow-up question: I want to get normalized counts corrected for the batch effect for all three conditions. I know that I can get normalized counts using "counts (dds, normalized=TRUE)", but I am not sure if the counts here is corrected for the batch effect?
I’d recommend vst followed by the limma::removeBatchEffect function.
I have actually used that function to do this before. I am just wondering if the normalized counts generated by DeSeq2 are batch corrected? And more importantly, are the DE genes that I get using the results function batch corrected?
DE results control for batch (this is the same design as the vignette where we talk about controlling for covariates).
No, counts normalized or not, as well as transformed counts (vst, rlog) are not batch corrected in DESeq2.