How to retain metadata after running removeBatchEffect() following RUVSeq, to format a PCA plot?

Hello, I am new to DESeq2 and was able to run RUVSeq on my samples to help account for batch effects. When I run plotPCA() on the set after doing RUVg, the data clusters nicely, how I want it to.

I then moved the data to a dds object to use DESeq2 using the code below:

dds <- DESeqDataSetFromMatrix(countData = counts(set2),
                        colData = pData(set2),
                        design = ~ W_1 + genotype)
dds <- DESeq(dds)

I wanted to look at PCA plots using vst() next, so I did that with the following code:

vsd <- vst(dds, blind=FALSE)

vsd_nobatch <- removeBatchEffect(assay(vsd), 
                              design = model.matrix(~ set2@phenoData@data$genotype), 
                              covariates = set2@phenoData@data$W_1)

plotPCA(vsd_nobatch, col=colors[metadata$genotype])

What I get is a PCA plot that clusters as expected, with the data points in the graph being the sample names (column names of vsd_nobatch), colored based on genotype. However, what I would like to do is to show a PCA plot with the data points just being points, still colored based on genotype, and then overlaying different labels from the metadata (not just sample name).

The issue is that when I run removeBatchEffect(), it returns a matrix, rather than a DESeq2 object, so I can't edit the labels like I can do before running removeBatchEffect(). Pre-removeBatchEffect(), I can run plotPCA() on vsd, say that the groups are by genotype, and then add a label based on id on top of the point, as shown in the code below.

plotPCA(vsd, intgroup="genotype") +
  geom_text(aes(label=metadata$id), color = "black")

Is there a way I can get the same results after running removeBatchEffect()? When I try running this, the code still runs, but it's as though it ignored the whole geom_text line: the plot looks the same whether or not that line is included:

plotPCA(vsd_nobatch, intgroup="genotype")+
  geom_text(aes(label=threeDSSmetadata$id), color = "black")

Thank you very much for your help! Please let me know if anything isn't clear from my question.

sessionInfo( )
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils    
[7] datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.3               pheatmap_1.0.12            
 [3] RColorBrewer_1.1-2          RUVSeq_1.20.0              
 [5] edgeR_3.28.1                limma_3.42.2               
 [7] EDASeq_2.20.0               ShortRead_1.44.3           
 [9] GenomicAlignments_1.22.1    Rsamtools_2.2.3            
[11] Biostrings_2.54.0           XVector_0.26.0             
[13] dplyr_1.0.5                 DESeq2_1.26.0              
[15] SummarizedExperiment_1.16.1 DelayedArray_0.12.3        
[17] BiocParallel_1.20.1         matrixStats_0.58.0         
[19] Biobase_2.46.0              GenomicRanges_1.38.0       
[21] GenomeInfoDb_1.22.1         IRanges_2.20.2             
[23] S4Vectors_0.24.4            BiocGenerics_0.32.0
plotPCA DESeq2 RUVSeq removebatcheffect()

Note the exact code used in the vignette:

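There, the batch-corrected matrix is assigned back to assay(vsd), so vsd remains a DESeqTransform object and plotPCA() with intgroup still works on it.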
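As a rough sketch of that pattern adapted to the question (not the vignette text itself): the simulated data from makeExampleDESeqDataSet() and the made-up genotype, W_1, and id columns below are stand-ins for the poster's set2 metadata. W_1 is passed through covariates because it is a continuous RUVg factor rather than a batch label. The original call most likely did nothing with geom_text because plotPCA() on a plain matrix dispatches to a base-graphics method (from EDASeq, as far as I can tell) that does not return a ggplot object.

```r
library(DESeq2)
library(limma)
library(ggplot2)

# Hypothetical stand-in data: simulated counts plus invented metadata columns
# named after the ones in the question (genotype, W_1, id).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 8)
dds$genotype <- factor(rep(c("WT", "KO"), each = 4), levels = c("WT", "KO"))
dds$W_1 <- rnorm(ncol(dds))                    # placeholder for the RUVg factor
dds$id <- paste0("id", seq_len(ncol(dds)))     # placeholder label column
design(dds) <- ~ W_1 + genotype

vsd <- vst(dds, blind = FALSE)

# Remove the W_1 effect from the transformed values while protecting genotype,
# then write the corrected matrix back into the same object.
mat <- assay(vsd)
mm <- model.matrix(~ genotype, colData(vsd))
mat <- limma::removeBatchEffect(mat, covariates = vsd$W_1, design = mm)
assay(vsd) <- mat

# vsd is still a DESeqTransform, so DESeq2's plotPCA() can use colData.
# returnData = TRUE gives a data frame for a fully custom ggplot.
pca <- plotPCA(vsd, intgroup = c("genotype", "id"), returnData = TRUE)
percent_var <- round(100 * attr(pca, "percentVar"))

p <- ggplot(pca, aes(PC1, PC2, color = genotype)) +
  geom_point(size = 3) +
  geom_text(aes(label = id), color = "black", vjust = -1) +
  xlab(paste0("PC1: ", percent_var[1], "% variance")) +
  ylab(paste0("PC2: ", percent_var[2], "% variance"))
print(p)
```

Any other colData column can be used as the label by adding it to intgroup and pointing aes(label = ...) at it.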

Ah I got it now, thank you very much!
