Question: PCA plot of variance stabilized transformation of normalized counts in limma
Raymond wrote (4 weeks ago):

Hi, 

I used to make PCA plots with DESeq2, and it worked great. In DESeq2, the normalized counts are transformed with the vst() function (variance stabilizing transformation, based on the NB variance ~ expectation relationship?) rather than simply log-transformed.

I tried limma-voom on my new dataset, which has more than 500 samples with various factors (sex, batch, treatment, genotype, etc.). Then I used the plotMDS function:

plotMDS(lcpm[,subsamples], top=500, 
        col=df_annotation$col[subsamples], 
        labels=NULL, dim = c(1,2))

plotMDS with dim=c(1,2), c(1,3), c(1,4), c(2,3), c(2,4), or c(3,4) showed no obvious separation by any of those known factors.

I noticed that plotMDS uses the log-transformed, TMM-normalized counts directly. Since I can also get the mean-variance relationship from efit, why is there no such vst-transformed expression data for the PCoA plot?

In plotMDS, if I choose gene.selection="common", is the output identical to a PCA plot of the same log-transformed data? To my understanding, PCA and PCoA are identical when Euclidean distance is used. Is that right?

 

Thanks & regards,

Raymond

modified 4 weeks ago by Steve Lianoglou • written 4 weeks ago by Raymond
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote (4 weeks ago):

Yes, gene.selection="common" will make the MDS distance equivalent to PCA.
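The general equivalence behind this can be checked in base R. The following is only a sketch on simulated data (the matrix `x` is made up for illustration): PCA of the centred samples and classical MDS (PCoA) on Euclidean distances give the same coordinates up to the sign of each axis. plotMDS reports a root-mean-square distance rather than a plain Euclidean one, which as far as I know only rescales the axes.

```r
# Simulated log-expression matrix: genes in rows, samples in columns.
set.seed(1)
x <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)

# PCA on the samples (rows of t(x)), centring each gene but not scaling.
pca <- prcomp(t(x), center = TRUE, scale. = FALSE)

# Classical MDS (PCoA) on Euclidean distances between samples.
mds <- cmdscale(dist(t(x)), k = 2)

# Compare the first two axes, ignoring the arbitrary sign of each axis.
max_diff <- max(abs(abs(pca$x[, 1:2]) - abs(mds)))
print(max_diff)
```

The reported difference should be at the level of floating-point rounding error.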

To stabilize the variances for the MDS plot, use cpm() with prior.count=5.
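A minimal sketch of this suggestion, assuming the edgeR and limma packages are installed. The count matrix here is simulated purely for illustration; in practice `counts` would be the real data and the colours would come from the sample annotation.

```r
library(edgeR)  # also loads limma, which provides plotMDS()

# Simulated counts for illustration only: 1000 genes x 8 samples.
set.seed(1)
counts <- matrix(rnbinom(1000 * 8, mu = 50, size = 5), nrow = 1000, ncol = 8)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
colnames(counts) <- paste0("sample", seq_len(ncol(counts)))

dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)  # TMM normalization

# A larger prior count than the default damps the variance of log-CPM
# values for genes with low counts.
lcpm <- cpm(dge, log = TRUE, prior.count = 5)

# MDS plot using one common set of the 500 most variable genes.
plotMDS(lcpm, top = 500, gene.selection = "common")
```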

modified 4 weeks ago • written 4 weeks ago by Gordon Smyth

Thanks, Gordon. Based on your experience, when would you set gene.selection="common" and when gene.selection="pairwise"? Is there any rule of thumb?

written 29 days ago by Raymond

I use "pairwise" unless the number of samples is very large. With a large number of samples, "pairwise" is quadratically slow so I switch to "common".
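To see where the quadratic cost comes from, here is a base-R sketch of the two distance definitions as I understand them from the plotMDS documentation. It is not limma's actual code, and `x`, `top`, `d_pairwise` and `d_common` are illustrative names. "pairwise" ranks all genes separately for every pair of samples, so the number of rankings grows as n(n-1)/2, while "common" ranks the genes once.

```r
# Simulated log-expression matrix: genes in rows, samples in columns.
set.seed(1)
x <- matrix(rnorm(300 * 5), nrow = 300, ncol = 5)
top <- 50
n <- ncol(x)

# "pairwise": for each pair of samples, take the `top` genes with the
# largest squared difference for that pair, then the root-mean-square.
d_pairwise <- matrix(0, n, n)
for (i in 2:n) {
  for (j in 1:(i - 1)) {
    sq <- sort((x[, i] - x[, j])^2, decreasing = TRUE)[1:top]
    d_pairwise[i, j] <- d_pairwise[j, i] <- sqrt(mean(sq))
  }
}

# "common": pick the `top` most variable genes once, then use the
# root-mean-square difference over that shared gene set for every pair.
keep <- order(apply(x, 1, var), decreasing = TRUE)[1:top]
d_common <- as.matrix(dist(t(x[keep, ]))) / sqrt(top)

# Number of per-pair gene rankings needed by "pairwise".
n_pairs <- choose(n, 2)
```

With 500 samples, `choose(500, 2)` is 124750 pairs, each needing its own pass over all genes.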

written 29 days ago by Gordon Smyth

Thanks, Gordon. 

modified 28 days ago • written 28 days ago by Raymond