Question: How can I use DESeq2 for exploratory analysis after batch effect removal using RUVSeq?
0
23 months ago by
dbouzo0
dbouzo0 wrote:

Hi all,

I am conducting some RNA-Seq experiments to determine differentially expressed genes after treatment with various antimicrobials.  I have used RUVSeq to remove a batch effect present in my data set and DESeq2 to estimate differential expression.

I used the EDASeq::plotPCA function on the SeqExpressionSet generated from the RUVs argument and it shows my samples subjected to the same treatment are clustering closer together.  I then used this SeqExpressionSet to estimate differential expression in DESeq2 with the code:

dds <- DESeqDataSetFromMatrix(countData = counts(set_postRUVs_W1),
colData = pData(set_postRUVs_W1),
design = ~ group +  W_1)
dds <- DESeq(dds)

I would now like to use some of the exploratory analysis techniques in the DESeq2 vignette to compare my data before and after removing the batch effect, specifically how the different samples and most variable genes are clustering.  Following the code in the vignette, there is no difference in clustering before and after removing the batch effect.  The same goes for a PCA analysis of the rlog transformed dds object - the samples are clustering just the same as they were before the batch removal with RUVSeq.

Excuse my naïvety but is this because the sample distances are determined based on the sizeFactors column of the DESeq assay object, and it is not factoring in the W_1 column from the set_postRUVs_W1 object?

I'd be grateful for any advice on how to address this or if I have missed something working between RUVSeq and DESeq2.

Thank you!

modified 23 months ago by Michael Love25k • written 23 months ago by dbouzo0
Answer: How can I use DESeq2 for exploratory analysis after batch effect removal using R
0
23 months ago by
Michael Love25k
United States
Michael Love25k wrote:

You can use limma's removeBatchEffect(x, batch) with the transformed data matrix (assay(vsd) or assay(rld)) and the W column from RUV. This will remove mean shifts in the transformed counts associated with the W column. This is similar to what is done inside of the model when you run DESeq() with a design of ~W + condition.

Sorry,

I have such design

    condition   batch
A1  treatment   1
A2  treatment   1
A3  treatment   1
A4  treatment   1
A5  treatment   1
A6  treatment   1
A7  control     1
A8  control     1
A9  control     1
A10 control     1
A11 control     1
A12 control     1
B1  treatment   2
B2  treatment   2
B3  treatment   2
B4  treatment   2
B5  treatment   2
B6  treatment   2
B7  control     2
B8  control     2
B9  control     2
B10 control     2
B11 control     2
B12 control     2


I am only interested in knowing genes related to batch (two experiments) ignoring any change due to condition (treatment); Does this give me these genes?

Thank you

dds <- DESeqDataSetFromMatrix(countData = countData,colData = colData,design = ~ batch + condition)


This is unrelated to the thread. In general, you should make a new post if the question isn't related.

That line of code builds a dataset, it doesn't give you any results. You can use the results() function to extract the results for the batch coefficient. Read over the help file: ?results and the vignette for details.

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.