Is DESeq2 PCAplot already scaled?
1
0
Entering edit mode
@sofiagreen72211-13691
Last seen 2.9 years ago
United States

Hi All,

I'm trying to draw scaled PCA plot for my sequencing data while using DESeq2. The code, I am using is:

ds <- DESeqDataSetFromMatrix(countData=counts, colData=samples, design=~condition) 
ds$condition <- relevel(ds$condition, "Ctrl")
ds <- ds[rowSums(counts(ds)>1) >= 2, ] #PCA plot

# transformation
rld_ds <- rlog(ds, blind=FALSE) #regularized log-transformation

#PCA plot1
plotPCA(rld_ds, intgroup = c("SampleName"))

# PCA plot 2 using ggplot2
(pcadata <- plotPCA(rld_ds, intgroup = c( "condition", "SampleName"), returnData=TRUE))
(percentVar <- round(100 * attr(pcadata, "percentVar"))

ggplot(pcadata, aes(PC1, PC2, color=condition, label=SampleName)) +
  geom_point(size=3) +
  geom_text(aes(label=SampleName),hjust=.5, vjust=-1, size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) + 
  ylab(paste0("PC2: ",percentVar[2],"% variance")) #plot 2

Now, my question:

Are these PCA plots already scaled due to rld transformation? if not, how can I do that?

Thanks,

Sofia

 

deseq2 • 3.6k views
ADD COMMENT
1
Entering edit mode
@steve-lianoglou-2771
Last seen 14 months ago
United States

You can take a look at the source for plotPCA by fetching it like so:

library(DESeq2)
getMethod("plotPCA", "DESeqTransform")

You will find in there the call to prcomp that does the PCA, namely:

pca <- prcomp(t(assay(object)[select, ]))

and a call to ?prcomp will show you the default values for the center and scale. paramenters (TRUE, and FALSE, respecitvely).

If you want to change the behavior, you can make your own pcaPlot function by copying the the code of that function into your own and tweaking accordingly.

ADD COMMENT
0
Entering edit mode

Hi Steve:

Thank you for your quick response.

As I am passing rlog transformation object (rld_ds) to plotPCA which stabilizes variance applied to the raw data, so do I still need to scale my PCAplot with:

pca <- prcomp(rld_ds, center=T, scale=T)

Thank you for your time. I look forward to learning your advice. 

Sofia

ADD REPLY
1
Entering edit mode

I think it's best not to scale the data (make all the genes have same variance). I've posted an argument on the support site before about this, but it's impossible to find again. Basically, the rlog or vst put the data on a scale so there are no systematic differences due to the dynamic range. Forcing each gene to have variance one makes it worse (reduces signal and amplifies noise).

ADD REPLY
0
Entering edit mode

Hi Michael:

Thank you very much for your reply. :)

Many thanks,

Sofia

ADD REPLY
0
Entering edit mode

I've tried to find that post several times now, too! Sadly I've had no luck with it either :-/

ADD REPLY
0
Entering edit mode

Yes! Thanks, bookmarked...

ADD REPLY
0
Entering edit mode

How did you divine that post?

Are you a God?

 

ADD REPLY
1
Entering edit mode

I utilized my l33t Google skillz:

pca plot scale love site:support.bioconductor.org

the search on this site is worthless, so I always just use Google directly with the site qualifier. It's the first hit on the second page (at least based on what Google thinks I want).

ADD REPLY
0
Entering edit mode

nice move -- strangely(?), it's also the it's also the first on the second page for me, too.

Maybe the search box on this site should just urlencode a google query with site:support.bioconductor.org appended to the end and bypass the internal search functionality entirely.

ADD REPLY
0
Entering edit mode
ADD REPLY

Login before adding your answer.

Traffic: 454 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6