I am working with a set of samples and around 4000 genes expressed across them (there is no control group).
- I have a counts df with samples as columns and genes as rows
- I have a coldata df with samples as rows and observations as columns
I have plotted a PCA using DESeq2:
    plotPCA(vsd, intgroup=c("condition", "type"))

    # Same PCA, but returning the coordinates so the plot can be customised
    pcaData <- plotPCA(vsd, intgroup=c("condition", "type"), returnData=TRUE)
    percentVar <- round(100 * attr(pcaData, "percentVar"))
    ggplot(pcaData, aes(PC1, PC2, color=condition, shape=type)) +
      geom_point(size=3) +
      xlab(paste0("PC1: ", percentVar[1], "% variance")) +
      ylab(paste0("PC2: ", percentVar[2], "% variance")) +
      coord_fixed()
I would like to draw a biplot on this PCA, with the environmental variables from my coldata shown as arrows pointing in the directions in which they explain the variability.
- Is there a way I could include the eigenvectors or the factors from my "coldata" which are potentially explaining the PCA variability of my samples?
- In what format does my data need to be (i.e. numeric only, transposed, coldata merged with the count data)?
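To make the goal concrete, here is a minimal base-R sketch of the kind of overlay I am after. It is only an illustration under assumptions: `expr` and `coldata` are toy stand-ins with made-up variables (`temperature`, `salinity`), and with my real data I assume `expr` would be the transformed matrix from `assay(vsd)`. It uses `prcomp()` rather than `plotPCA()`, so the coordinates will not match the plot above exactly (as far as I know, `plotPCA()` only uses the most variable genes by default). The arrows are simply the correlations of each numeric coldata column with PC1 and PC2, which may not be the right way to do it.

```r
# Toy stand-ins for my real objects (hypothetical names and values).
set.seed(1)
n_genes <- 50
n_samples <- 12
expr <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(paste0("gene", 1:n_genes), paste0("s", 1:n_samples))
)
coldata <- data.frame(
  temperature = rnorm(n_samples, mean = 20, sd = 3),
  salinity = rnorm(n_samples, mean = 35, sd = 1),
  row.names = colnames(expr)
)

# prcomp() expects samples in rows, so the genes-by-samples matrix is transposed.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
scores <- pca$x[, 1:2]

# Keep only numeric coldata columns, in the same sample order as the scores.
env <- coldata[rownames(scores), sapply(coldata, is.numeric), drop = FALSE]

# Correlation of each environmental variable with PC1 and PC2 (one row per variable).
arrows_xy <- cor(env, scores)

# Stretch the arrows to the range of the scores so they are visible on the plot.
arrow_scale <- max(abs(scores))
plot(scores, pch = 19, xlab = "PC1", ylab = "PC2", asp = 1)
arrows(0, 0, arrows_xy[, 1] * arrow_scale, arrows_xy[, 2] * arrow_scale,
       length = 0.1, col = "red")
text(arrows_xy * arrow_scale, labels = rownames(arrows_xy), pos = 3, col = "red")
```

In this sketch the coldata is not merged with the counts: the two tables are only matched by sample name, and non-numeric columns such as `condition` or `type` are dropped before computing the arrows.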
What I tried with PCAtools:

    library(PCAtools)
    biplot(pcaData)
    # Error in nrow(y) : argument "y" is missing, with no default

    sessionInfo()
    R version 4.0.3 (2020-10-10)
    Platform: x86_64-apple-darwin17.0 (64-bit)
    Running under: macOS Catalina 10.15.4