Hi all,
I am conducting some RNA-Seq experiments to determine differentially expressed genes after treatment with various antimicrobials. I have used RUVSeq to remove a batch effect present in my data set and DESeq2 to estimate differential expression.
I ran EDASeq::plotPCA on the SeqExpressionSet returned by the RUVs function, and it shows that samples subjected to the same treatment now cluster closer together. I then used this SeqExpressionSet to estimate differential expression in DESeq2 with the code:
dds <- DESeqDataSetFromMatrix(countData = counts(set_postRUVs_W1),
                              colData = pData(set_postRUVs_W1),
                              design = ~ group + W_1)
dds <- DESeq(dds)
I would now like to use some of the exploratory analysis techniques in the DESeq2 vignette to compare my data before and after removing the batch effect, specifically how the samples and the most variable genes cluster. Following the code in the vignette, I see no difference in clustering before and after removing the batch effect. The same goes for a PCA of the rlog-transformed dds object: the samples cluster just as they did before the batch removal with RUVSeq.
Excuse my naïvety but is this because the sample distances are determined based on the sizeFactors column of the DESeq assay object, and it is not factoring in the W_1 column from the set_postRUVs_W1 object?
I'd be grateful for any advice on how to address this or if I have missed something working between RUVSeq and DESeq2.
Thank you!
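One possible explanation, offered as a sketch rather than a definitive answer: the rlog and VST transformations do not use the design formula to remove variation, so including W_1 in the design changes the model fit in DESeq() but not the transformed values that the PCA and distance plots are built from. The DESeq2 vignette suggests limma::removeBatchEffect on the transformed assay for visualization. The example below uses simulated data from makeExampleDESeqDataSet; the `condition` column stands in for `group`, and the random `W_1` values are a placeholder for the factor estimated by RUVs, so all of these are assumptions rather than the poster's actual objects.

```r
library(DESeq2)
library(limma)

set.seed(1)

# Simulated counts; `condition` plays the role of `group` in the post
dds <- makeExampleDESeqDataSet(n = 500, m = 8)

# Placeholder for the RUVs factor of unwanted variation (assumed values)
dds$W_1 <- rnorm(ncol(dds))
design(dds) <- ~ condition + W_1

# rlog does not remove W_1, even with blind = FALSE
rld <- rlog(dds, blind = FALSE)

# Regress W_1 out of the transformed values while protecting `condition`
keep <- model.matrix(~ condition, data = as.data.frame(colData(rld)))
adj <- removeBatchEffect(assay(rld), covariates = rld$W_1, design = keep)

# Store the adjusted values in a copy so both versions can be plotted
rld_adj <- rld
assay(rld_adj) <- adj

plotPCA(rld, intgroup = "condition")      # before adjustment
plotPCA(rld_adj, intgroup = "condition")  # after adjustment
```

The adjusted matrix is meant for plots and clustering only; the differential expression test would still use the original counts with W_1 in the design, as in the code above.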
Sorry,
I have a design like this.
I am only interested in finding the genes related to batch (two experiments), ignoring any change due to condition (treatment). Does this give me these genes?
Thank you
This is unrelated to the thread. In general, you should make a new post if the question isn't related.
That line of code builds a dataset; it doesn't give you any results. You can use the results() function to extract the results for the batch coefficient. Read over the help file, ?results, and the vignette for details.
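As a rough sketch of what that could look like: the data below are simulated, and the column name `batch` with levels `exp1` and `exp2` is an assumption, not your actual design, so substitute your own variable and level names.

```r
library(DESeq2)

set.seed(1)

# Simulated counts with a `condition` factor (levels A and B)
dds <- makeExampleDESeqDataSet(n = 500, m = 8)

# Hypothetical batch variable: two experiments, balanced across condition
dds$batch <- factor(rep(c("exp1", "exp2"), times = 4))

# Condition is controlled for; batch is the variable of interest
design(dds) <- ~ condition + batch
dds <- DESeq(dds)

# List the available coefficients
resultsNames(dds)

# Extract the batch effect (exp2 vs exp1), adjusted for condition
res_batch <- results(dds, contrast = c("batch", "exp2", "exp1"))

# Genes ordered by adjusted p-value
head(res_batch[order(res_batch$padj), ])
```

Without a `contrast` or `name` argument, results() returns the comparison for the last variable in the design formula, so it is safer to name the batch comparison explicitly.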