I have been using the pheatmap package for a while now, having moved away from the traditional heatmap.2. However, I recently noticed that pheatmap appears to scale the data after the row and column dendrograms have been built with the specified clustering method, which heatmap.2 does not do. So if I just want to plot the column dendrogram, with the distances that pheatmap uses for my data, how do I plot it? I have already run hclust on my data to extract this information and plotted the column dendrogram with the tree height, but the ordering is a bit different from what pheatmap shows. So I reckon that if I need the column dendrogram, it would be best to plot the actual tree that pheatmap builds for my data, using a distance measure such as correlation. How can I do this?
Data-set test:
test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
pheatmap code:
# col is my colour palette (definition not shown here)
pheatmap(test,
         scale = "row",
         clustering_distance_cols = "correlation",
         clustering_method = "ward.D2",
         cluster_cols = TRUE,
         show_rownames = TRUE,
         show_colnames = TRUE,
         color = col,
         fontsize_row = 6,
         fontsize_col = 7,
         border_color = NA,
         cellwidth = NA,
         cellheight = NA)
I used the code below to plot the column dendrogram with hclust, but it is not exactly what pheatmap gives, so I should use the one from pheatmap.
dr<-mydist(scale(test))
hr<-hclust(dr, method = "ward.D2", members=NULL)
plot(hr,cex=0.8)
The mydist function used above, since I am interested in correlation distances post-clustering:
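# Correlation-based distance between the columns of d
mydist <- function(d)
{
  cormia <- cor(d, method = "pearson")
  # Treat undefined correlations (NA) as perfect correlation, i.e. distance 0
  cormia[which(is.na(cormia))] <- 1
  dismia <- as.dist(1 - cormia)
  dismia
}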
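One thing worth checking about the mydist(scale(test)) call: scale() centres and scales each column, and Pearson correlation between columns is unchanged by that, so the column distances should be the same as for the unscaled data. Row scaling, which is what scale = "row" refers to, does change them. A small sketch of that check in base R (set.seed and the names row_scaled, d_raw, d_colscaled and d_rowscaled are only there for illustration):

```r
set.seed(1)  # only so the illustration is reproducible
test <- matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3
colnames(test) <- paste("Test", 1:10, sep = "")
rownames(test) <- paste("Gene", 1:20, sep = "")

# Column-wise scaling (what scale(test) does) vs. row-wise scaling
row_scaled <- t(scale(t(test)))

# Correlation distances between columns for each version of the data
d_raw       <- as.dist(1 - cor(test))
d_colscaled <- as.dist(1 - cor(scale(test)))
d_rowscaled <- as.dist(1 - cor(row_scaled))

# Column scaling leaves the column correlations untouched
same_after_col_scaling <- isTRUE(all.equal(cor(scale(test)), cor(test)))

# Row scaling gives different column correlations
same_after_row_scaling <- isTRUE(all.equal(cor(row_scaled), cor(test)))

# Trees built from the raw and the row-scaled data, for comparison
hc_raw       <- hclust(d_raw, method = "ward.D2")
hc_rowscaled <- hclust(d_rowscaled, method = "ward.D2")
```

So any difference between my hclust tree and the pheatmap one may come down to whether the rows were scaled before the column distances were computed.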
The dendrogram created with ward.D2 and correlation distance reflects the heatmap I get from heatmap.2(). The one I get from pheatmap shows the same classification into 2 broad conditions as heatmap.2 and the dendrogram, but the ordering of the columns is not exactly the same, since pheatmap scales the df after applying the clustering method and creating the column dendrograms. So how can I retrieve the column tree from pheatmap and plot only the column dendrogram from it for my samples, which are the columns of the df?
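From the pheatmap help page, the function invisibly returns a list that includes tree_row and tree_col, so something along these lines might be what I am after. This is only a sketch, assuming the pheatmap package is installed; I have left out my colour palette and font settings, and res and col_tree are just names I picked:

```r
library(pheatmap)

set.seed(1)  # only so the illustration is reproducible
test <- matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] <- test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] <- test[15:20, seq(2, 10, 2)] + 4
colnames(test) <- paste("Test", 1:10, sep = "")
rownames(test) <- paste("Gene", 1:20, sep = "")

# Capture the return value instead of discarding it;
# silent = TRUE skips drawing the heatmap itself
res <- pheatmap(test,
                scale = "row",
                clustering_distance_cols = "correlation",
                clustering_method = "ward.D2",
                silent = TRUE)

# tree_col should be the hclust object behind the column dendrogram
col_tree <- res$tree_col

# Plot only the column dendrogram
plot(col_tree, cex = 0.8)

# Column labels in the left-to-right order of the dendrogram
col_tree$labels[col_tree$order]
```

If that is right, col_tree could also be passed to as.dendrogram() or cutree() like any other hclust result.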