DESeq2 - Plot Nr. Genes vs Variance explained in PCA
@andrebolerbarros-16788
Portugal

Hey everyone,

I know that the plotPCA function from DESeq2 uses, by default, only the 500 most variable genes. I was wondering whether it makes sense to plot (or whether anyone has already plotted) the variance explained by PC1 and PC2 as a function of the number of genes considered.

Something like this: [image not available]

Where X is the number of genes considered and Y is the sum of the variance explained by the first two PCs.
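
To make the idea concrete, here is a minimal sketch of how such a curve could be computed. It uses base R only and a simulated matrix, so everything in it is an assumption for illustration: `mat`, `group`, `pc12_fraction` and `ntop_grid` are made-up names, and with real data `mat` would typically be something like `assay(vst(dds))` for a hypothetical DESeqDataSet `dds`. The selection step (rank genes by row variance, keep the top `ntop`, run `prcomp` on the transposed matrix) is meant to mirror what plotPCA does, as far as I understand it.

```r
# Simulated stand-in for a transformed expression matrix (genes x samples).
# ASSUMPTION: with real data this would be e.g. assay(vst(dds)).
set.seed(1)
n_genes <- 2000
n_samples <- 12
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)

# Add a two-group effect to the first 200 genes so the PCA has some structure.
group <- rep(c(0, 1), each = n_samples / 2)
mat[1:200, ] <- mat[1:200, ] + outer(rnorm(200, sd = 2), group)

# Fraction of variance explained by PC1 + PC2 when only the `ntop`
# most variable genes are used (hypothetical helper).
pc12_fraction <- function(mat, ntop) {
  rv <- apply(mat, 1, var)
  select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
  pca <- prcomp(t(mat[select, , drop = FALSE]))
  percent_var <- pca$sdev^2 / sum(pca$sdev^2)
  sum(percent_var[1:2])
}

ntop_grid <- c(50, 100, 250, 500, 1000, 2000)
frac <- sapply(ntop_grid, pc12_fraction, mat = mat)

# X: number of genes considered, Y: variance explained by PC1 + PC2.
plot(ntop_grid, frac, type = "b", log = "x",
     xlab = "Number of genes (ntop)",
     ylab = "Fraction of variance explained by PC1 + PC2")
```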

Thank you in advance!

@mikelove
United States

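One thing that is a little tricky about this analysis is that the denominator is changing: the fraction of variance explained is computed relative to the total variance in the top 'N' genes, and that total is different for every N.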
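
A small sketch of the changing denominator, again with base R and simulated data (all names here, such as `mat`, `ord` and `summarise_ntop`, are hypothetical and only for illustration). For each N it reports the total variance in the top N genes (the denominator), the variance captured by PC1 + PC2 on an absolute scale (the numerator), and their ratio.

```r
# Simulated genes x samples matrix with a two-group effect in 200 genes.
# ASSUMPTION: stands in for a transformed count matrix.
set.seed(1)
mat <- matrix(rnorm(2000 * 12), nrow = 2000)
group <- rep(c(0, 1), each = 6)
mat[1:200, ] <- mat[1:200, ] + outer(rnorm(200, sd = 2), group)

# Rank genes by variance once, then take nested top-N subsets.
rv <- apply(mat, 1, var)
ord <- order(rv, decreasing = TRUE)

summarise_ntop <- function(ntop) {
  pca <- prcomp(t(mat[ord[seq_len(ntop)], , drop = FALSE]))
  pc_var <- pca$sdev^2
  c(ntop      = ntop,
    total_var = sum(pc_var),                     # denominator: sum of the top-N gene variances
    pc12_var  = sum(pc_var[1:2]),                # numerator, absolute scale
    pc12_frac = sum(pc_var[1:2]) / sum(pc_var))  # the plotted ratio
}

ntop_grid <- c(50, 100, 250, 500, 1000, 2000)
res <- t(sapply(ntop_grid, summarise_ntop))
print(round(res, 3))
```

Looking at `pc12_var` next to `pc12_frac` helps separate the two effects: the absolute variance captured by PC1 + PC2 cannot decrease as genes are added, whereas the fraction can fall simply because `total_var` grows faster.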


Indeed, the total amount of variance in the data changes with every N. Still, this seemed to me the best way to get a "grasp" on how many genes to use, i.e. the number of genes at which a two-PC plot best explains the variance present at that level. Does this make sense? Would you do it in a different way?

I was also thinking about including more PCs (up to 3 or 4), which we could then plot in a pairwise fashion (PC1 vs PC2, PC1 vs PC3, ...). What do you think?
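
A rough sketch of what I mean by the pairwise plots, using base R's `prcomp` and `pairs` on simulated data (the matrix, the `group` vector and the choice of 500 genes and 4 PCs are all assumptions for illustration, not something taken from plotPCA itself).

```r
# Simulated genes x samples matrix with a two-group effect in 200 genes.
# ASSUMPTION: stands in for a transformed count matrix.
set.seed(1)
mat <- matrix(rnorm(2000 * 12), nrow = 2000)
group <- rep(c(0, 1), each = 6)
mat[1:200, ] <- mat[1:200, ] + outer(rnorm(200, sd = 2), group)

# PCA on the 500 most variable genes.
rv <- apply(mat, 1, var)
select <- order(rv, decreasing = TRUE)[seq_len(500)]
pca <- prcomp(t(mat[select, ]))
percent_var <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# Scatterplot matrix of the first 4 PCs, colored by group,
# with the percent variance of each PC in the diagonal labels.
n_pc <- 4
pc_labels <- paste0("PC", seq_len(n_pc), " (", percent_var[seq_len(n_pc)], "%)")
pairs(pca$x[, seq_len(n_pc)], labels = pc_labels,
      col = group + 1, pch = 19)
```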


I mean, exploring your data in many ways is always a good idea; you can't go wrong (here I mean EDA, not doing a bunch of null hypothesis testing).
