I have been using the pheatmap package for a while now, having moved away from the traditional heatmap.2. However, I recently noticed that pheatmap and heatmap.2 seem to apply row scaling at different stages relative to building the row and column dendrograms, so the two do not necessarily cluster on the same values. If I just want to plot the column dendrogram, with the distances that pheatmap uses for my data, how do I do that? I have already run hclust on my data and plotted the column dendrogram with its tree heights, but the ordering is a bit different from the one in pheatmap. So if I need the column dendrogram, I reckon it would be ideal to plot the actual tree that pheatmap builds for my data, using a distance-based method such as correlation. How can I make this possible?
Data-set test:
test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
pheatmap code:
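library(pheatmap)
# 'col' is a colour palette defined elsewhere (not shown in this post)
pheatmap(test, scale = "row", clustering_distance_cols = "correlation",
         show_rownames = TRUE, show_colnames = TRUE, color = col,
         cluster_cols = TRUE, fontsize_row = 6, fontsize_col = 7,
         clustering_method = "ward.D2", border_color = NA,
         cellwidth = NA, cellheight = NA)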
I used the code below to plot the column dendrogram with hclust, but the result is not exactly what pheatmap gives, so I would rather use the tree from pheatmap itself.
# mydist() is defined further down in this post
dr <- mydist(scale(test))
hr <- hclust(dr, method = "ward.D2", members = NULL)
plot(hr, cex = 0.8)
The mydist function, which I use because I am interested in correlation-based distances for the clustering:
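mydist <- function(d) {
  # Pearson correlation between the columns of d
  cormia <- cor(d, method = "pearson")
  # Treat undefined correlations (e.g. constant columns) as perfectly correlated
  cormia[which(is.na(cormia))] <- 1
  # Convert the similarity matrix into a distance object
  dismia <- as.dist(1 - cormia)
  dismia
}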
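One thing that may be worth checking here, offered as a sketch rather than a confirmed diagnosis: Pearson correlation between columns does not change when each column is centred and scaled, so `scale(test)` (which works column-wise) should give the same distances as the raw matrix. Row scaling, which is what `scale = "row"` asks for, does change the column correlations. As far as I can tell from the pheatmap source, the scaling is applied before the trees are built, so the base-R snippet below clusters the columns on row-scaled values for comparison. The names `scaled`, `hc_scaled` and `hc_raw` and the fixed seed are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] <- test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] <- test[15:20, seq(2, 10, 2)] + 4
colnames(test) <- paste0("Test", 1:10)
rownames(test) <- paste0("Gene", 1:20)

# Column-wise scaling leaves the column correlations unchanged
same_dist <- isTRUE(all.equal(cor(scale(test)), cor(test)))

# Row-wise z-scores: scale() works on columns, so transpose before and after
scaled <- t(scale(t(test)))

# Correlation distance between columns, then Ward clustering
hc_scaled <- hclust(as.dist(1 - cor(scaled, method = "pearson")), method = "ward.D2")
hc_raw    <- hclust(as.dist(1 - cor(test,   method = "pearson")), method = "ward.D2")

# Compare the two column orderings
hc_scaled$labels[hc_scaled$order]
hc_raw$labels[hc_raw$order]

plot(hc_scaled, cex = 0.8, main = "Columns clustered on row-scaled data")
```

If the ordering from `hc_scaled` is closer to the pheatmap column order than the ordering from `hc_raw`, that would suggest the stage at which scaling is applied accounts for the difference. Note that a dendrogram's branches can be rotated without changing the tree, so two plots may show a different leaf order even when the clustering is identical.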
The dendrogram created with ward.D2 and the correlation distance reflects the heatmap I get from heatmap.2(). The one in pheatmap gives the same classification into 2 broad conditions as heatmap.2 and my dendrogram, but the ordering of the columns is not exactly the same, which I assume is because pheatmap and heatmap.2 apply the row scaling at different stages relative to building the column dendrogram. So how can I retrieve the column tree from pheatmap and plot only the column dendrogram for my samples, which are the columns in the df?
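For reference, a minimal sketch of one way this could work: pheatmap() invisibly returns a list whose `tree_col` element is the hclust object used for the columns, so it can be stored and passed to plot(). The names `res` and `col_tree` and the fixed seed are assumptions made for illustration, `silent = TRUE` only suppresses drawing the heatmap, and the `color` argument is left at its default because `col` is not defined in this post.

```r
library(pheatmap)

set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] <- test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] <- test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] <- test[15:20, seq(2, 10, 2)] + 4
colnames(test) <- paste0("Test", 1:10)
rownames(test) <- paste0("Gene", 1:20)

# Keep the return value instead of discarding it
res <- pheatmap(test, scale = "row",
                clustering_distance_cols = "correlation",
                clustering_method = "ward.D2",
                cluster_cols = TRUE,
                silent = TRUE)

# The column tree is a regular hclust object
col_tree <- res$tree_col
plot(col_tree, cex = 0.8, main = "Column dendrogram from pheatmap")

# Column labels in the order given by the tree
col_tree$labels[col_tree$order]
```

The row tree is available in the same way as `res$tree_row`, and `as.dendrogram(col_tree)` can be used if a dendrogram object is needed instead.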
