ntop parameter in plotPCA(). DESeq2

Given that the PCA plot is likely to change somewhat depending on the number of genes you decide to specify with the ntop parameter, are there any recommendations on how to best set this value besides arbitrarily setting it at the default of 500/1000? Could including all genes have a negative effect if a lot of the genes have low variance?




"Could including all genes have a negative effect if a lot of the genes have low variance?"

Not really. The low count noise is dealt with by vst() or rlog(). It's up to you what ntop to choose. Choosing a small-ish number, e.g. 500, focuses the PCA plot on the most variable genes across samples, which are often the DE genes across conditions (although the condition information is not used to pick the genes in the PCA plot).
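To make the selection step concrete, here is a minimal sketch of ranking genes by their variance across samples and keeping the top ntop. It is written in plain Python rather than R, and the function name `top_variance_genes` and the toy values are made up for illustration; it is not the DESeq2 code itself. Note that the sample labels never enter the ranking.

```python
from statistics import variance

def top_variance_genes(expr, ntop=500):
    """Return the names of the ntop genes with the highest variance across samples.

    expr maps a gene name to its (already transformed) values, one per sample.
    No condition/group labels are used anywhere in the ranking.
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:min(ntop, len(ranked))]

# Toy, hypothetical values for four genes across four samples.
expr = {
    "geneA": [5.0, 5.1, 9.0, 9.2],   # varies strongly between samples
    "geneB": [7.0, 7.1, 7.0, 6.9],   # nearly flat
    "geneC": [3.0, 6.0, 3.1, 6.2],   # varies, but less than geneA
    "geneD": [8.0, 8.0, 8.1, 8.0],   # nearly flat
}

selected = top_variance_genes(expr, ntop=2)
print(selected)
```

The PCA would then be computed on the selected rows only, which is why a small ntop concentrates the plot on the genes that differ most between samples.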


Hi @mikelove. When I increase the ntop value for my data to 1000, 2000, and 3000, my PC1 and PC2 get worse and worse. I am using VST-normalised counts (and this happens with both blind=TRUE and blind=FALSE).

  • at default: PC1, PC2 = 52, 21; total variance: 73%
  • at 1000: PC1, PC2 = 44, 22; total variance: 66%
  • at 2000: PC1, PC2 = 36, 23; total variance: 59%
  • at 3000: PC1, PC2 = 33, 24; total variance: 57%

Hmm, I wouldn't say the PCs "get worse". They just show you something else. This has to do with the theory of PCA. When we restrict to the top-variance genes, PC1 is typically aligned with the direction along which those genes vary, so PC1 makes up most of the variance of this subset of the entire space. When we increase the number of genes we look at, the percent of variance explained by PC1 goes down. This isn't specific to your dataset; you'd get something similar with data simulated from a multivariate Gaussian distribution with a certain covariance structure.
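A rough simulation along those lines, in plain Python with only the standard library. Everything here is an assumption made for illustration: 12 samples, 3000 genes, one shared Gaussian factor that loads on the first 300 genes, and unit-variance independent noise on every gene. The helper names `simulate` and `pc1_percent` are hypothetical. The percent of variance on PC1 is computed from the largest eigenvalue of the centered sample-by-sample Gram matrix, found by power iteration.

```python
import random
from statistics import variance

def simulate(n_genes=3000, n_signal=300, n_samples=12, loading=3.0, seed=1):
    """Gaussian data: one shared factor on the first n_signal genes plus noise."""
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_samples)]
    data = []
    for g in range(n_genes):
        load = loading if g < n_signal else 0.0
        data.append([load * factor[j] + rng.gauss(0, 1) for j in range(n_samples)])
    return data

def pc1_percent(data, ntop):
    """Percent of variance on PC1 after keeping the ntop highest-variance genes."""
    rows = sorted(data, key=variance, reverse=True)[:min(ntop, len(data))]
    n = len(rows[0])
    # Center each gene across samples.
    centered = []
    for row in rows:
        mean = sum(row) / n
        centered.append([x - mean for x in row])
    # Sample-by-sample Gram matrix; it shares its nonzero eigenvalues
    # with the gene-by-gene scatter matrix that PCA would decompose.
    gram = [[sum(r[a] * r[b] for r in centered) for b in range(n)] for a in range(n)]
    # Power iteration for the largest eigenvalue. The start vector is not
    # constant, because the all-ones vector lies in the null space of gram.
    v = [float(i + 1) for i in range(n)]
    for _ in range(200):
        w = [sum(gram[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    gv = [sum(gram[a][b] * v[b] for b in range(n)) for a in range(n)]
    top_eigenvalue = sum(v[a] * gv[a] for a in range(n))
    total = sum(gram[a][a] for a in range(n))
    return 100.0 * top_eigenvalue / total

data = simulate()
results = {ntop: pc1_percent(data, ntop) for ntop in (500, 1000, 2000, 3000)}
for ntop, pct in results.items():
    print(f"ntop={ntop}: PC1 explains {pct:.1f}% of variance")
```

Under these assumptions the PC1 percentage should shrink as ntop grows, because each additional gene adds mostly independent noise to the total variance while contributing little along PC1.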

