Hello, I am new to DESeq2 and was able to run RUVSeq on my samples to help account for batch effects. When I run plotPCA() on the set after doing RUVg, the data clusters nicely, how I want it to.
I then moved the data into a `dds` object for use with DESeq2, using the code below:

    dds <- DESeqDataSetFromMatrix(countData = counts(set2),
                                  colData = pData(set2),
                                  design = ~ W_1 + genotype)
    dds <- DESeq(dds)

Next, I wanted to look at PCA plots of the variance-stabilized data from `vst()`, so I ran the following code:

    vsd <- vst(dds, blind=FALSE)
    vsd_nobatch <- removeBatchEffect(assay(vsd),
                                     design = model.matrix(~ set2@phenoData@data$genotype),
                                     covariates = set2@phenoData@data$W_1)
    plotPCA(vsd_nobatch, col=colors[metadata$genotype])

What I get is a PCA plot that clusters as expected, with the data points drawn as the sample names (the column names of `vsd_nobatch`), colored by genotype. However, what I would like is a PCA plot where the data points are plain points, still colored by genotype, with different labels from the metadata (not just the sample name) overlaid on them.
The issue is that `removeBatchEffect()` returns a matrix rather than a DESeq2 object, so I can't edit the labels the way I can before running it. Before running `removeBatchEffect()`, I can call `plotPCA()` on `vsd`, group by `genotype`, and then add a label based on `id` on top of each point, as shown in the code below:

    plotPCA(vsd, intgroup="genotype") +
      geom_text(aes(label=metadata$id), color = "black")

Is there a way I can get the same result after running `removeBatchEffect()`? When I try the following, the code still runs, but it's as though it ignores the whole `geom_text` line; the plot looks the same whether or not that line is included:

    plotPCA(vsd_nobatch, intgroup="genotype") +
      geom_text(aes(label=threeDSSmetadata$id), color = "black")

Thank you very much for your help! Please let me know if anything isn't clear from my question.

    sessionInfo()
    R version 3.6.3 (2020-02-29)
    Platform: x86_64-apple-darwin15.6.0 (64-bit)
    Running under: macOS 10.16

    Matrix products: default
    LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

    locale:
    en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    parallel stats4 stats graphics grDevices utils datasets methods base

    other attached packages:
    ggplot2_3.3.3 pheatmap_1.0.12 RColorBrewer_1.1-2 RUVSeq_1.20.0
    edgeR_3.28.1 limma_3.42.2 EDASeq_2.20.0 ShortRead_1.44.3
    GenomicAlignments_1.22.1 Rsamtools_2.2.3 Biostrings_2.54.0 XVector_0.26.0
    dplyr_1.0.5 DESeq2_1.26.0 SummarizedExperiment_1.16.1 DelayedArray_0.12.3
    BiocParallel_1.20.1 matrixStats_0.58.0 Biobase_2.46.0 GenomicRanges_1.38.0
    GenomeInfoDb_1.22.1 IRanges_2.20.2 S4Vectors_0.24.4 BiocGenerics_0.32.0