Question: PCA plot of variance stabilized transformation of normalized counts in limma
4 weeks ago
Raymond wrote:


I used to make PCA plots with DESeq2, and that worked well. In DESeq2, the normalized counts are transformed with the vst() function (variance stabilizing transformation, based on the NB variance ~ expectation relationship?) rather than being log-transformed directly.

I tried limma-voom on my new dataset, which has more than 500 samples with various factors (sex, batch, treatment, genotype, etc.). Then I used the plotMDS function:

plotMDS(lcpm[,subsamples], top=500, 
        labels=NULL, dim = c(1,2))

plotMDS with dim=c(1,2), c(1,3), c(1,4), c(2,3), c(2,4), or c(3,4) showed no obvious separation by any of those known factors.

I noticed that plotMDS uses the log-transformed, TMM-normalized counts directly. Since I can also get the mean-variance relationship from efit, why is there no vst-style transformed expression data for the PCoA plot?

In plotMDS, if I choose gene.selection = "common", is the output identical to a PCA plot of the same log-transformed data? To my understanding, PCA and PCoA are identical when Euclidean distance is used. Is that right?


Thanks & regards,


4 weeks ago by Gordon Smyth
4 weeks ago
Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth wrote:

Yes, gene.selection="common" will make the MDS distance equivalent to PCA.
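
A minimal base R sketch of why this holds, using simulated data rather than limma itself (the matrix x and its dimensions are assumptions for illustration): classical MDS (PCoA) on Euclidean distances between samples gives the same coordinates as PCA on the gene-centered data, up to the sign of each axis. With gene.selection="common", every pair of samples is compared on the same set of genes, which is what makes the distance Euclidean up to a constant factor.

```r
# Hypothetical data: 1000 genes x 8 samples of simulated log-expression values
set.seed(42)
x <- matrix(rnorm(1000 * 8), nrow = 1000,
            dimnames = list(NULL, paste0("S", 1:8)))

# PCA on samples (rows of t(x)), centering each gene but not scaling
pca <- prcomp(t(x), center = TRUE, scale. = FALSE)

# Classical MDS (PCoA) on Euclidean distances between the same samples
pcoa <- cmdscale(dist(t(x)), k = 2)

# The two sets of coordinates should match up to the sign of each axis,
# so compare absolute values; the difference should be numerically zero
max_diff <- max(abs(abs(pca$x[, 1:2]) - abs(pcoa)))
print(max_diff)
```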

To stabilize the variances for the MDS plot, use cpm() with prior.count=5.
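
In edgeR this corresponds to a call such as cpm(y, log=TRUE, prior.count=5), since the prior count only takes effect on the log scale. The sketch below is a simplified base R stand-in, not edgeR's implementation (which differs in detail, for example in how the prior count is scaled by library size). The simulated counts and the helper simple_logcpm are hypothetical. It illustrates that a larger prior count damps the variability of log values for low-count genes:

```r
# Hypothetical data: 200 genes x 6 samples, Poisson counts at four expression levels
set.seed(1)
mu <- rep(c(2, 20, 200, 2000), each = 50)
counts <- matrix(rpois(length(mu) * 6, lambda = rep(mu, 6)), ncol = 6)

# Simplified log2-CPM with a prior count added to every observation
# (an illustrative helper, not the edgeR function)
simple_logcpm <- function(counts, prior) {
  lib <- colSums(counts)
  t(log2((t(counts) + prior) / (lib + 2 * prior) * 1e6))
}

# Per-gene standard deviation across samples
gene_sd <- function(m) apply(m, 1, sd)

# Average SD among the lowest-expressed genes under a small and a larger prior
low <- mu == 2
sd_small_prior <- mean(gene_sd(simple_logcpm(counts, 0.5))[low])
sd_large_prior <- mean(gene_sd(simple_logcpm(counts, 5))[low])
print(c(small = sd_small_prior, large = sd_large_prior))
```

The larger prior shrinks the spread of the low-count genes so that they contribute less noise to the sample distances.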

4 weeks ago by Gordon Smyth

Thanks, Gordon. In your experience, when would you set gene.selection="common" and when gene.selection="pairwise"? Is there a rule of thumb?

written 29 days ago by Raymond

I use "pairwise" unless the number of samples is very large. With a large number of samples, "pairwise" is quadratically slow so I switch to "common".

written 29 days ago by Gordon Smyth

Thanks, Gordon. 

written 28 days ago by Raymond
