Unsupervised heatmap using TPM or rlog gives different results
@jarod_v6liberoit-6654

I am interested in unsupervised clustering of my samples. I ran a differential expression analysis with DESeq2 after importing the RSEM data as described in the tximport vignette.

I tried to compare two heatmaps, one built from log2 normalized counts and one from rlog values, and I get two different results. Is this normal?

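library(DESeq2)
library(pheatmap)
library(matrixStats)  # rowVars(); older workflows take it from genefilter

# log2 of the size-factor-normalized counts, with a pseudocount of 1
norm.counts <- counts(dds, normalized=TRUE)
log.norm.counts <- log2(norm.counts + 1)

# indices of the 1000 most variable genes: order all genes, then keep the top 1000
topVarGenes <- order(-rowVars(log.norm.counts))[1:1000]

# center each gene on its mean
mat <- log.norm.counts[topVarGenes, ]
mat <- mat - rowMeans(mat)

# my_pal2 is a colour palette defined elsewhere in the session
pheatmap(mat, clustering_method="complete", main=" ", color=my_pal2, show_rownames=FALSE,
         annotation_legend=FALSE, legend=TRUE, cluster_cols=TRUE, cluster_rows=TRUE,
         breaks=quantile(mat, seq(0, 1, length.out=length(my_pal2)+1)))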

# same selection and centering, applied to the rlog-transformed values
topVarGenes1 <- order(-rowVars(assay(rld)))[1:1000]
mat1 <- assay(rld)[topVarGenes1, ]
mat1 <- mat1 - rowMeans(mat1)

pheatmap(mat1, clustering_method="complete", main="Unsupervised 1000 genes", color=my_pal2,
         show_rownames=FALSE, annotation_legend=FALSE, legend=TRUE, cluster_cols=TRUE,
         breaks=quantile(mat1, seq(0, 1, length.out=length(my_pal2)+1)))

The results are different. What is the error here?

 

 

Tags: deseq2, tpm, heatmap, pheatmap
@mikelove

It is expected that they differ. This is described in the DESeq2 paper, where we show from simulations that rlog gave better clustering performance than log2(normalized count + 1).
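As a rough illustration of why the two scales can rank genes differently, here is a small base-R simulation. It is only a sketch and not the analysis from the paper: the Poisson model, the means of 5 and 1000, and the variable names are all assumptions chosen for the example. With no true differences between samples, sampling noise alone produces a much larger variance on the log2(count + 1) scale for a weakly expressed gene than for a strongly expressed one, so low-count genes can dominate a "most variable genes" selection. rlog shrinks exactly these noisy low-count values.

```r
# Simulated Poisson counts with no true differences between samples,
# only sampling noise (the means 5 and 1000 are arbitrary choices)
set.seed(1)
n_samples <- 2000
low  <- rpois(n_samples, lambda = 5)     # weakly expressed gene
high <- rpois(n_samples, lambda = 1000)  # strongly expressed gene

# variance on the log2(count + 1) scale
var_low  <- var(log2(low + 1))
var_high <- var(log2(high + 1))

cat("variance at mean 5:   ", var_low, "\n")
cat("variance at mean 1000:", var_high, "\n")
```

The first variance comes out far larger than the second, even though neither simulated gene carries any biological signal.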

Here's the DESeq2 paper:

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8

See Figure 5 for a qualitative comparison and Figure S17 for the simulation results.
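Separately from the choice of transformation, one thing worth checking in code like this is where the subsetting bracket sits relative to order(). The toy example below uses base R only; the matrix, the value of k, and the use of apply(x, 1, var) as a stand-in for rowVars() are assumptions made for illustration. Subsetting inside order() just reorders the first k genes, while subsetting the result of order() picks the k most variable genes overall. Also note that index 0 is silently dropped in R, so 0:1000 selects the same elements as 1:1000.

```r
set.seed(42)
# Toy matrix: 50 genes x 6 samples; the spread of each gene grows with its row index
gene_sd <- seq(0.1, 5, length.out = 50)
x <- matrix(rnorm(50 * 6, sd = rep(gene_sd, times = 6)), nrow = 50)

row_var <- apply(x, 1, var)  # base-R stand-in for rowVars()

k <- 10
top_correct <- order(-row_var)[1:k]  # rank all genes, then keep the top k
top_subset  <- order(-row_var[1:k])  # only the first k genes, merely reordered

print(sort(top_correct))
print(sort(top_subset))
```

If the two heatmaps were built with the bracket in different positions, they would be drawn from different gene sets, which alone would make them look different.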
