Question: Different PCA plots using rlog and vsd on the same data set
2.2 years ago by
lirongrossmann wrote:

Hi All,

I have been using DESeq2 to analyze a dataset and ran into a problem I am not sure how to solve.

I have been using the following code to run DESeq2 on my dataset:

 

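library(DESeq2)

# ep: count matrix (genes x samples); cp: sample table with a Risk column
dds <- DESeqDataSetFromMatrix(countData = ep, colData = cp, design = ~ Risk)
dds <- estimateSizeFactors(dds)

# regularized log transformation
rld <- rlog(dds)
plotPCA(rld, intgroup = "Risk")

# variance stabilizing transformation
vsd <- varianceStabilizingTransformation(dds)
plotPCA(vsd, intgroup = "Risk")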

 

The two PCA plots I got look completely different, so I am not sure which transformation I should rely on for further analysis. 

Any help?

Thanks

 

modified 2.2 years ago • written 2.2 years ago by lirongrossmann

Thanks. I have 2 groups that I want to compare in my RNA-seq dataset: one group contains 6 samples, the other contains 100 samples.

When I run DESeq2 I get more than 1000 DE genes. But for some reason, when I plot the PCA using vsd and again using rlog, I see different separation of the groups.

Interestingly, when I narrowed my analysis to 6 vs 6, the plots do look similar.

Is this a known problem when comparing groups of highly unequal size?

Thanks!

written 2.2 years ago by lirongrossmann

Try blind=FALSE. This is recommended in the vignette when there are many large differences.
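
For reference, a minimal sketch of what the blind=FALSE suggestion could look like. The dataset here is simulated with makeExampleDESeqDataSet as a stand-in for the poster's ep and cp (an assumption, not the real data), so the grouping column is called condition rather than Risk.

```r
library(DESeq2)

# Simulated stand-in for the poster's data (an assumption, not the real dataset):
# 1000 genes, 12 samples, with a two-level `condition` column in colData.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds <- estimateSizeFactors(dds)

# blind = FALSE: dispersions are estimated using the design formula, so
# differences between the groups do not inflate the fitted mean-dispersion
# trend that both transformations rely on.
rld <- rlog(dds, blind = FALSE)
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)

plotPCA(rld, intgroup = "condition")
plotPCA(vsd, intgroup = "condition")
```

On the data from the question, the equivalent calls would presumably be rlog(dds, blind = FALSE) and varianceStabilizingTransformation(dds, blind = FALSE), followed by plotPCA with intgroup = "Risk".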

written 2.2 years ago by Michael Love

Thank you. I tried it with the top 30 genes and it didn't work. I was wondering whether the highly unequal sizes of the two compared groups bias the PCA and the clustering, because when I narrow down to groups of equal size I do see clear separation (with both vsd and rlog).

I would really like to upload the plots, but I don't know which URL I should upload them to.

written 2.2 years ago by lirongrossmann
Answer: Different PCA plots using rlog and vsd on the same data set
2.2 years ago by
Michael Love (United States) wrote:

Can you describe the data or the plots? How large are the differences, what is the experimental design, etc.? There is some description of the differences between the transformations in the vignette.

written 2.2 years ago by Michael Love

Thanks! 

I will try it. 

written 2.2 years ago by lirongrossmann
Answer: Different PCA plots using rlog and vsd on the same data set
2.2 years ago by
Wolfgang Huber (EMBL European Molecular Biology Laboratory) wrote:

Can you try selecting the top 100 (or 200, 500) genes by baseMean, or by rowVars of dds or vsd? When you apply PCA to all genes, the 'signal' may be dominated by precarious variations in the many genes with low counts.
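
A rough sketch of the top-N selection idea, written in base R so it runs without the original data. The matrix mat is simulated as a stand-in for a transformed matrix such as assay(vsd), and the group labels, the 6 vs 100 split, the 50 shifted genes and the helper name top_var_pca are all illustrative assumptions.

```r
# Simulated stand-in for a transformed expression matrix (genes x samples).
set.seed(42)
n_genes <- 2000
group <- factor(rep(c("high", "low"), times = c(6, 100)))  # hypothetical labels
mat <- matrix(rnorm(n_genes * length(group)), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("s", seq_along(group))))
# Give the first 50 genes a shift in the small group.
mat[1:50, group == "high"] <- mat[1:50, group == "high"] + 3

# Hypothetical helper: PCA on the `ntop` genes with the highest row variance.
top_var_pca <- function(mat, ntop) {
  rv <- apply(mat, 1, var)
  sel <- order(rv, decreasing = TRUE)[seq_len(min(ntop, nrow(mat)))]
  prcomp(t(mat[sel, , drop = FALSE]))  # samples as rows, centered, unscaled
}

# Compare the first two components for several gene-set sizes.
for (ntop in c(100, 200, 500)) {
  pca <- top_var_pca(mat, ntop)
  pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)
  plot(pca$x[, 1], pca$x[, 2], col = as.integer(group), pch = 19,
       xlab = paste0("PC1 (", pct[1], "%)"),
       ylab = paste0("PC2 (", pct[2], "%)"),
       main = paste("Top", ntop, "genes by variance"))
}
```

With DESeq2 objects, the same row-variance selection is available through the ntop argument, e.g. plotPCA(vsd, intgroup = "Risk", ntop = 100). Selecting by mean instead could be done by ordering rowMeans(counts(dds, normalized = TRUE)) and subsetting vsd to those rows before calling plotPCA.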

Please also try posting the PCA plots.

written 2.2 years ago by Wolfgang Huber