Question

How can I use DESeq2 for exploratory analysis after batch effect removal using RUVSeq?

0

Entering edit mode

dbouzo • 0

@dbouzo-13452

Last seen 6.3 years ago

Hi all,

I am conducting some RNA-Seq experiments to determine differentially expressed genes after treatment with various antimicrobials. I have used RUVSeq to remove a batch effect present in my data set and DESeq2 to estimate differential expression.

I used the EDASeq::plotPCA function on the SeqExpressionSet generated from the RUVs argument and it shows my samples subjected to the same treatment are clustering closer together. I then used this SeqExpressionSet to estimate differential expression in DESeq2 with the code:

dds <- DESeqDataSetFromMatrix(countData = counts(set_postRUVs_W1), 
                               colData = pData(set_postRUVs_W1), 
                               design = ~ group +  W_1)
dds <- DESeq(dds)

I would now like to use some of the exploratory analysis techniques in the DESeq2 vignette to compare my data before and after removing the batch effect, specifically how the different samples and most variable genes are clustering. Following the code in the vignette, there is no difference in clustering before and after removing the batch effect. The same goes for a PCA analysis of the rlog transformed dds object - the samples are clustering just the same as they were before the batch removal with RUVSeq.

Excuse my naïvety but is this because the sample distances are determined based on the sizeFactors column of the DESeq assay object, and it is not factoring in the W_1 column from the set_postRUVs_W1 object?

I'd be grateful for any advice on how to address this or if I have missed something working between RUVSeq and DESeq2.

Thank you!

deseq2 ruvseq rna-seq batch effect • 2.1k views

ADD COMMENT • link updated 7.0 years ago by Michael Love 43k • written 7.0 years ago by dbouzo • 0

score 0 · Answer 1 · 2017-11-20

0

Entering edit mode

Michael Love 43k

@mikelove

Last seen 9 hours ago

United States

You can use limma's removeBatchEffect(x, batch) with the transformed data matrix (assay(vsd) or assay(rld)) and the W column from RUV. This will remove mean shifts in the transformed counts associated with the W column. This is similar to what is done inside of the model when you run DESeq() with a design of ~W + condition.

ADD COMMENT • link 7.0 years ago Michael Love 43k

0

Entering edit mode

Sorry,

I have such design

    condition   batch
A1  treatment   1
A2  treatment   1
A3  treatment   1
A4  treatment   1
A5  treatment   1
A6  treatment   1
A7  control     1
A8  control     1
A9  control     1
A10 control     1
A11 control     1
A12 control     1
B1  treatment   2
B2  treatment   2
B3  treatment   2
B4  treatment   2
B5  treatment   2
B6  treatment   2
B7  control     2
B8  control     2
B9  control     2
B10 control     2
B11 control     2
B12 control     2

I am only interested in knowing genes related to batch (two experiments) ignoring any change due to condition (treatment); Does this give me these genes?

Thank you

dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData,design = ~ batch + condition)

ADD REPLY • link 5.8 years ago AZ ▴ 30

0

Entering edit mode

This is unrelated to the thread. In general, you should make a new post if the question isn't related.

That line of code builds a dataset, it doesn't give you any results. You can use the results() function to extract the results for the batch coefficient. Read over the help file: ?results and the vignette for details.

ADD REPLY • link 5.8 years ago Michael Love 43k