How to find genes causing the high variance of one replicate in the PCA
NMostajo ▴ 10
@nmostajo-10900
Last seen 6.3 years ago
Germany

Hello,

I have checked the PCA of my sRNA-seq data, and one replicate (1 of 18 samples) separates from all the other samples along a component that explains over 67% of the variance.

I know that the PCA is computed on the genes with the highest variance. This sample does not seem to have extreme values in all genes (I checked some at random), and I see the same behavior in PCAs restricted to different gene biotypes (miRNAs and snoRNAs).

Pvars <- rowVars(assay(rld))
topVarGenes <- head(order(Pvars, decreasing = TRUE), 35)

With this I found the genes with the highest variance, but I cannot identify the genes that are causing the "crazy" behavior of this sample.

I do not want to throw away the sample because I only have 3 replicates, and the behavior seems to be caused by a few genes. Also, the Cook's distance plot for this sample looks like those of the other samples.

Any suggestions on how to find the genes that are causing the extreme behavior in one replicate?

Thank you!

pca variance deseq2 gene_id
@mikelove
Last seen 1 day ago
United States

Here's one quick approach to look for genes where a sample has extreme counts. Use vst(), rlog() or normTransform() and then extract the transformed values with assay():

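mat <- assay(vsd)                      # transformed values, genes x samples
zscores <- t(scale(t(mat)))            # z-score each gene (row) across samples
hist(zscores[, idx])                   # where idx is the column number of the sample
gene.idx <- which(zscores[, idx] > x)  # where x is a large value; wrap in abs() to also catch extreme low values
plotCounts(dds, gene.idx[1])           # look at the counts for the first flagged gene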
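
To see what the z-score step does without a DESeq2 object at hand, here is a minimal sketch in base R on simulated data. Everything in it is an assumption for illustration: the matrix stands in for assay(vsd), the sample column idx, the planted genes and the cutoff x are made up. Note that with n samples a single sample's z-score cannot exceed (n - 1) / sqrt(n), which is about 4 for 18 samples, so the cutoff has to stay below that.

```r
# Simulated stand-in for assay(vsd): 200 genes x 18 samples (hypothetical data)
set.seed(1)
mat <- matrix(rnorm(200 * 18, mean = 8, sd = 0.2), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("s", 1:18)))

idx <- 7                                    # column of the suspect sample (assumed)
spiked <- c("gene5", "gene42", "gene117")   # genes made extreme on purpose
mat[spiked, idx] <- mat[spiked, idx] + 5

# z-score each gene (row) across samples
zscores <- t(scale(t(mat)))

# Flag genes where the suspect sample is extreme in either direction
x <- 3.5                                    # cutoff (assumed), below (18 - 1) / sqrt(18)
gene.idx <- which(abs(zscores[, idx]) > x)

# Flagged genes, most extreme first
print(sort(zscores[gene.idx, idx], decreasing = TRUE))
```

On real data, rownames(mat)[gene.idx] gives the gene IDs to pass on to plotCounts().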

Thanks! I got my gene IDs