Entering edit mode
Mozart
▴
30
@mozart-20625
Last seen 4.2 years ago
Hello there, after reading a few posts here about this particular topic, I guess I solved this 'puzzling' issue and I just need to check that this is the correct way to do this. Essentially, I need to output the PC1 and PC2 from plotPCA function in order to let someone else to play around with an external software these data.
rv <- rowVars(assay(rld))
select <- order(rv, decreasing = TRUE)[seq_len(min(500,length(rv)))]
pca <- prcomp(t(assay(rld)[select, ]))
intgroup.df <- as.data.frame(colData(rld)[, intgroup, drop = FALSE])
pca$x[,1]
pca$x[,2]
Is this the correct way to proceed?
Thanks in advance.
So, you can confirm that apart from the 'intgroup.df' everything is fine?
That is most likely what your colleagues want - yes.
pca$x
contains the rotated component loadings. In the code that you have pasted, though, you are also pre-selecting the genes with highest variance prior to performing PCA withprcomp()
; thus, you are somewhat biasing it.You can also get the same data with my own PCAtools package:
You can remove variables based on variance prior to performing PCA via the
removeVar
parameterThanks Kevin, now I have understood why this DESeq2 plot is defined 'biased'. So, rather than doing the code above I could simply do in the following way:
Is this what you were referring to? Sorry, I am bit rusty!
Yes that is the code, if you are using my PCAtools. You do not have to calculate
rv
, though.