How to retain metadata after running removeBatchEffect() after RUVSeq to format PCA plot?
@97dfc144
Last seen 2.9 years ago
United States

Hello, I am new to DESeq2 and was able to run RUVSeq on my samples to help account for batch effects. When I run plotPCA() on the set after doing RUVg, the data clusters nicely, how I want it to.

I then moved the data to a dds object to use DESeq2 using the code below:

dds <- DESeqDataSetFromMatrix(countData = counts(set2),
                        colData = pData(set2),
                        design = ~ W_1 + genotype)
dds <- DESeq(dds)

I wanted to look at PCA plots using vst() next, so I did that with the following code:

vsd <- vst(dds, blind=FALSE)

vsd_nobatch <- removeBatchEffect(assay(vsd), 
                              design = model.matrix(~ set2@phenoData@data$genotype), 
                              covariates = set2@phenoData@data$W_1)

plotPCA(vsd_nobatch, col=colors[metadata$genotype])

What I get is a PCA plot that clusters as expected, with the data points in the graph being the sample names (column names of vsd_nobatch), colored based on genotype. However, what I would like to do is to show a PCA plot with the data points just being points, still colored based on genotype, and then overlaying different labels from the metadata (not just sample name).

The issue is that when I run removeBatchEffect(), it returns a matrix, rather than a DESeq2 object, so I can't edit the labels like I can do before running removeBatchEffect(). Pre-removeBatchEffect(), I can run plotPCA() on vsd, say that the groups are by genotype, and then add a label based on id on top of the point, as shown in the code below.

plotPCA(vsd, intgroup="genotype") +
  geom_text(aes(label=metadata$id), color = "black")

Is there a way I can get the same results after running removeBatchEffect()? When I try the code below, it still runs, but it's as though the whole geom_text line is ignored: the plot looks the same whether or not that line is included:

plotPCA(vsd_nobatch, intgroup="genotype")+
  geom_text(aes(label=threeDSSmetadata$id), color = "black")

Thank you very much for your help! Please let me know if anything isn't clear from my question.

sessionInfo( )
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.3               pheatmap_1.0.12            
 [3] RColorBrewer_1.1-2          RUVSeq_1.20.0              
 [5] edgeR_3.28.1                limma_3.42.2               
 [7] EDASeq_2.20.0               ShortRead_1.44.3           
 [9] GenomicAlignments_1.22.1    Rsamtools_2.2.3            
[11] Biostrings_2.54.0           XVector_0.26.0             
[13] dplyr_1.0.5                 DESeq2_1.26.0              
[15] SummarizedExperiment_1.16.1 DelayedArray_0.12.3        
[17] BiocParallel_1.20.1         matrixStats_0.58.0         
[19] Biobase_2.46.0              GenomicRanges_1.38.0       
[21] GenomeInfoDb_1.22.1         IRanges_2.20.2             
[23] S4Vectors_0.24.4            BiocGenerics_0.32.0
plotPCA DESeq2 RUVSeq removebatcheffect() • 1.3k views
@mikelove
Last seen 6 hours ago
United States

Note the exact code used in the vignette:

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-after-vst-are-there-still-batches-in-the-pca-plot

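There, the output of removeBatchEffect() is assigned back to assay(vsd), so vsd stays a DESeqTransform object with its colData, and plotPCA() with intgroup still works.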
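
A minimal sketch of that pattern, run on simulated data so it is self-contained. The simulated dds, the random W_1 values, and the id labels are assumptions made for illustration only; in the original question these would come from the RUVSeq set and the metadata table. The sketch also uses returnData=TRUE to build the plot by hand, which is one way to overlay arbitrary labels, not the only one.

```r
library(DESeq2)
library(limma)
library(ggplot2)

set.seed(1)

# Simulated stand-in for the real data (assumption, not the poster's samples)
dds <- makeExampleDESeqDataSet(n = 2000, m = 8)
dds$genotype <- dds$condition                 # hypothetical genotype column
dds$W_1 <- rnorm(ncol(dds))                   # hypothetical RUVg factor
dds$id <- paste0("id", seq_len(ncol(dds)))    # hypothetical label column
design(dds) <- ~ W_1 + genotype

vsd <- vst(dds, blind = FALSE, nsub = 500)

# Remove the W_1 effect while protecting genotype in the design
mm <- model.matrix(~ genotype, colData(vsd))
mat <- removeBatchEffect(assay(vsd), covariates = vsd$W_1, design = mm)

# Key step: assign the corrected matrix back into the object,
# so vsd remains a DESeqTransform and keeps its colData
assay(vsd) <- mat

# Get PCA coordinates as a data frame and attach any label you like
pcaData <- plotPCA(vsd, intgroup = "genotype", returnData = TRUE)
pcaData$label <- vsd$id   # same sample order as the columns of vsd

p <- ggplot(pcaData, aes(PC1, PC2, color = genotype)) +
  geom_point(size = 3) +
  geom_text(aes(label = label), color = "black", vjust = -1)
print(p)
```

If the labels live in a separate metadata table rather than in colData, they need to be in the same sample order as the columns of vsd before being attached to pcaData.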


Ah I got it now, thank you very much!
