Hi,
I'm trying to cluster rows and columns in DESeq2 and show it in a heatmap. I've tried this:
heatmap.2(assay(vst)[select, ], col = hmcol,
          Rowv = TRUE, Colv = TRUE, scale = "row", trace = "none", margins = c(8, 8))
I think the problem shows up with scale="row": the genes don't seem to be clustered, yet with scale="none" I see highly, moderately, and lowly expressed genes in separate clusters. I think this means that the rows are not normalized before clustering, whereas when clustering columns, each column is normalized, right?
I think that when the rows aren't normalized, genes with low counts tend to cluster together and genes with high counts cluster together. For example:
geneW, Sample A = 10, B = 20
geneX, Sample A = 15, B = 15
geneY, Sample A = 1000, B = 2000
geneZ, Sample A = 1500, B = 1500
If each gene were normalized, W would end up closer to Y and X closer to Z, because in reality W changes the same way as Y (both double from A to B), while X stays constant, like Z.
Is my interpretation correct? How would I normalize the rows?
Thanks!
In the DESeq2 workflow, we recommend first stabilizing the variance of the observed counts (FPKM values don't tell you about their precision). Then, as you suggest, we also recommend removing each gene's mean if the goal is to cluster genes that show similar relative changes across samples (where the absolute count or absolute expression level matters less than the relative differences).
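To make the mean-removal step concrete, here is a minimal base R sketch on toy data. It is only an illustration under stated assumptions: the four genes are adapted from the example in the question (with geneZ assumed constant at 1500 in both samples), and a plain log2 transform stands in for a real variance-stabilizing transformation. The object names (`counts`, `logmat`, `centered`, `groups`) are made up for this sketch.

```r
# Toy counts adapted from the question; geneZ is ASSUMED constant (1500, 1500).
counts <- matrix(c(10,   20,
                   15,   15,
                   1000, 2000,
                   1500, 1500),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("geneW", "geneX", "geneY", "geneZ"),
                                 c("A", "B")))

# ASSUMPTION: log2 is only a stand-in for a variance-stabilizing transformation.
logmat <- log2(counts)

# Without centering, Euclidean distances between rows are dominated by the
# overall expression level of each gene.
d_raw <- as.matrix(dist(logmat))

# Remove each gene's mean so that only relative changes across samples remain.
# The vector from rowMeans() is recycled down the columns, so row i has its
# own mean subtracted.
centered <- logmat - rowMeans(logmat)
d_centered <- as.matrix(dist(centered))

# Hierarchical clustering of the centered rows, cut into two groups.
groups <- cutree(hclust(dist(centered)), k = 2)

print(round(d_raw, 2))
print(round(d_centered, 2))
print(groups)
```

On this toy data the uncentered distances should put geneW next to geneX, while after centering geneW should group with geneY and geneX with geneZ. With a real DESeq2 object the same step would look roughly like `mat <- assay(vsd)[select, ]` followed by `mat <- mat - rowMeans(mat)`, assuming `vsd` is the output of `vst()` or `varianceStabilizingTransformation()`; the centered matrix can then be passed to the heatmap function with scale="none".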