Batch correction with DeSeq2: problem with creating model and getting all conditions
1
0
Entering edit mode
pkhadka • 0
@pkhadka-20024
Last seen 5.2 years ago

I have RNAseq data for three conditions (A, B, and C) with results from three different batches (PK, GB, and EG) and I am trying to apply batch correction using DeSeq2. I am intersted in contrasting all three condtions (i.e. A vs B, B vs C and A vs C) while taking the batch effect into account. My code is as follow:

# Assign condition and batch
(condition <- factor(c(rep("A", 3), rep("B", 3), rep("C", 3))))
(batch <- factor (c("PK","GB", "EG","PK","GB", "EG","PK","GB", "EG")))

# Analysis with DESeq2 ----------------------------------------------------

library(DESeq2)

# Create a coldata frame and instantiate the DESeqDataSet
(coldata <- data.frame(row.names=colnames(countdata), condition, batch))
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~batch + condition)

# Run the DESeq pipeline
dds <- DESeq(dds)

When I run this code and then do resutsNames(dds), I get the following:

[1] "Intercept"        "batch_GB_vs_EG"   "batch_PK_vs_EG"   "condition_B_vs_A" "condition_C_vs_A"

However, I am also interested in "conditionAvsC" and I am not sure why this is not being created by this design formula. Also, I am not sure why "batchPKvsGB" is not being created. Is there something wrong with my design formula?

Thanks for your help!

deseq2 • 788 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 5 hours ago
United States

You can extract contrasts of interest using the contrast argument of results. Take a look at the help page:

?results

The answer as to why certain coefficients are listed involves how linear models are represented in R. You can take a look at course material about linear models, for example the course I taught with Rafael Irizarry on Edx has some online material about linear models.

ADD COMMENT
0
Entering edit mode

Thank you! Just another quick follow-up question: I want to get normalized counts corrected for the batch effect for all three conditions. I know that I can get normalized counts using "counts (dds, normalized=TRUE)", but I am not sure if the counts here is corrected for the batch effect?

ADD REPLY
0
Entering edit mode

I’d recommend vst followed by the limma::removeBatchEffect function.

ADD REPLY
0
Entering edit mode

I have actually used that function to do this before. I am just wondering if the normalized counts generated by DeSeq2 are batch corrected? And more importantly, are the DE genes that I get using the results function batch corrected?

ADD REPLY
0
Entering edit mode

DE results control for batch (this is the same design as the vignette where we talk about controlling for covariates).

No, counts normalized or not, as well as transformed counts (vst, rlog) are not batch corrected in DESeq2.

ADD REPLY

Login before adding your answer.

Traffic: 776 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6