Question

PCA plot on DE genes does not separate samples. DESeq2.

g.atla ▴ 10

I am running a DE analysis using DESeq2. When I plot the PCA of the differentially expressed genes (ntop=500) using the commands below, I do not see a clear separation between treated and untreated samples.

de <- rownames(resdds[ (resdds$padj<0.05) & (!is.na((resdds)$pvalue)) & (!is.na(resdds$padj)),])
data <- plotPCA(vsd, intgroup="treat", returnData=TRUE, ntop=500)
percentVar <- round(100 * attr(data, "percentVar"))
ggplot(data, aes(PC1, PC2, color=treat)) +
  geom_point(size=3) +
  geom_text_repel(aes(label=row.names(data))) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance"))

Are my results reliable? Here is the MA plot, and the GO results look good.

Thanks in advance.

deseq2 pca • 3.7k views

updated 8.0 years ago by James W. MacDonald 65k • written 8.0 years ago by g.atla ▴ 10

score 4 · Answer 1 · 2016-04-12

I disagree. You get almost complete separation between the groups on the second principal component. Anyway, principal component analysis looks at differences between samples based on linear combinations of the top 500 genes, whereas each univariate comparison is based on an individual gene. There will always be differences between an aggregation of data and individual observations, and the former doesn't invalidate the latter.
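To illustrate the point, here is a minimal base-R sketch on simulated data (not the poster's data). All names and effect sizes are made up for illustration, and the nuisance factor `batch` is an assumption that does not appear in the question. A strong nuisance effect takes over PC1 and the treatment effect shows up on PC2, while per-gene tests still recover the treatment genes. Note also that in DESeq2, `plotPCA(..., ntop=500)` picks the 500 genes with the highest variance across samples; the `de` vector computed in the question is never passed to it, so that plot is not restricted to the DE genes.

```r
set.seed(1)

n_genes <- 500
n_per_group <- 6
treat <- factor(rep(c("untreated", "treated"), each = n_per_group))
# Hypothetical nuisance factor, balanced within each treatment group
batch <- factor(rep(c("a", "b"), times = n_per_group))

# Simulated log-scale expression: genes in rows, samples in columns
x <- matrix(rnorm(n_genes * length(treat), sd = 0.5), nrow = n_genes)
# Large nuisance shift in 200 genes
x[1:200, batch == "b"] <- x[1:200, batch == "b"] + 3
# Smaller treatment shift in 50 other genes
x[201:250, treat == "treated"] <- x[201:250, treat == "treated"] + 2

# PCA on all genes (prcomp expects samples in rows and centers by default)
scores <- prcomp(t(x))$x

# Univariate view: one two-sample t-test per gene, BH-adjusted
pvals <- apply(x, 1, function(g) {
  t.test(g[treat == "treated"], g[treat == "untreated"])$p.value
})
de <- which(p.adjust(pvals, method = "BH") < 0.05)

# PCA restricted to the genes called DE
scores_de <- prcomp(t(x[de, , drop = FALSE]))$x

# TRUE if the two levels of g do not overlap along the given score
separates <- function(score, g) {
  r <- split(score, g)
  max(r[[1]]) < min(r[[2]]) || min(r[[1]]) > max(r[[2]])
}

separates(scores[, "PC1"], batch)     # nuisance factor on PC1
separates(scores[, "PC2"], treat)     # treatment on PC2
separates(scores_de[, "PC1"], treat)  # treatment on PC1 once restricted to DE genes
```

With the objects from the question, the analogous restriction would presumably be `plotPCA(vsd[de, ], intgroup="treat")`, which I have not run on that data. Keep in mind that a PCA of genes already selected for differing between groups will separate those groups more or less by construction, so it is a visualization rather than independent evidence that the DE results are reliable.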