Question

How to avoid misinterpreting misleading "red" and "green" cells on a heatmap

joseph ▴ 50

I use heatmaps a lot in my work, but I have found that a heatmap on its own can sometimes lead people to misinterpret the data. For example:

I have data from several time points or groups, and the set of significantly differentially expressed genes differs from group to group.

For example, with Group A, Group B and Group C:

Group A has significantly differentially expressed genes: aa, bb, cc, dd, ee

Group B has significantly differentially expressed genes: dd, gg, ff, ee, aa

Group C has significantly differentially expressed genes: tt, uu, ee, aa, dd

To put them in the same heatmap, I need to combine the genes from all the comparison groups:

All: aa, bb, cc, dd, ee, ff, gg, tt, uu

However, when I present the data in the heatmap, some genes, such as gg, tt and uu, have a high fold change in Group A but a p-value above 0.05, so they are not significantly differentially expressed in Group A. The heatmap then misleads people into thinking that gg, tt and uu are also very highly expressed in that group.

I tried filtering by p-value: for genes such as gg, tt and uu, whose p-values in Group A are larger than 0.05, I set the fold change to 0 or NA. But when I plot, the function reports the error "Error in hclust(d, method = method) :
NA/NaN/Inf in foreign function call (arg 11)".

Any suggestions on how to solve this problem? Thank you.

pheatmap

updated 7.0 years ago by Gavin Kelly ▴ 680 • written 7.0 years ago by joseph ▴ 50

score 0 · Answer 1 · 2017-05-08

I don't think it's necessarily misleading that a bright colour represents a high expression; the heatmap is representing best estimates of fold-changes, rather than significances, and most people interpret them as such. The ability to detect common directions in expression (even if they have failed to show enough consistency to achieve significance) generally outweighs the use of a heatmap to discern significance patterns.

One compromise might be to plot the heatmap across all samples, rather than aggregated replicate groups, so that the viewer would see that the colour pattern within an experimental group varied, thus indicating a potential lack of significance.
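As a rough illustration of the per-sample idea, here is a minimal sketch using base R's heatmap. The matrix `expr`, the sample names and the replicate layout (three replicates per group) are made up for the example and are not taken from the question; in real use they would be your own expression values.

```r
# Simulated per-sample expression values (illustrative data only):
# 4 genes x 9 samples, three replicates in each of groups A, B and C.
set.seed(42)
genes   <- c("aa", "dd", "gg", "tt")
samples <- paste0(rep(c("A", "B", "C"), each = 3), 1:3)
expr <- matrix(rnorm(length(genes) * length(samples)),
               nrow = length(genes),
               dimnames = list(genes, samples))

# "aa" is shifted consistently in all three A replicates,
# whereas "gg" has a single outlying A replicate driving its group mean.
expr["aa", 1:3] <- expr["aa", 1:3] + 3
expr["gg", 1]   <- expr["gg", 1] + 6

# One colour per sample, marking the group it belongs to.
group_cols <- unname(c(A = "grey20", B = "grey50", C = "grey80")[substr(samples, 1, 1)])

# Colv = NA keeps the samples in their original order, so replicates stay side by side.
hm <- heatmap(expr, Colv = NA, scale = "row", ColSideColors = group_cols)
```

In a plot like this, a gene whose group mean is driven by one replicate shows an uneven colour pattern within the group, which hints at the lack of significance without hiding any values.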

However, if you really do want to censor the images, you could run heatmap twice. Run it once with the real values, setting keep.dendro=TRUE, and capture the output. Then adjust your data matrix so that the cells you want to censor are set to zero, and call the heatmap function again with this new data, setting the Rowv and Colv parameters to the dendrograms that you captured in the first call.
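The two-pass approach could look something like the sketch below, which uses base R's heatmap. The matrices `lfc` and `pval` are simulated stand-ins (assumptions for the example, not the poster's data), with gg, tt and uu marked as non-significant in Group A to mirror the question.

```r
# Simulated log fold changes: 9 genes x 3 groups (illustrative data only).
set.seed(1)
genes  <- c("aa", "bb", "cc", "dd", "ee", "ff", "gg", "tt", "uu")
groups <- c("A", "B", "C")
lfc <- matrix(rnorm(length(genes) * length(groups), sd = 2),
              nrow = length(genes),
              dimnames = list(genes, groups))

# Hypothetical p-values: everything significant except gg, tt, uu in Group A.
pval <- matrix(0.01, nrow = length(genes), ncol = length(groups),
               dimnames = list(genes, groups))
pval[c("gg", "tt", "uu"), "A"] <- 0.5

# First pass: cluster on the real values and keep the dendrograms.
# heatmap() returns its result invisibly, so it has to be assigned.
first <- heatmap(lfc, scale = "none", keep.dendro = TRUE)

# Censor the non-significant cells. Using 0 rather than NA means no
# missing values are ever passed on to the plotting code.
censored <- lfc
censored[pval > 0.05] <- 0

# Second pass: reuse the captured dendrograms, so the row and column
# order is identical to the first plot and no re-clustering happens.
second <- heatmap(censored, Rowv = first$Rowv, Colv = first$Colv, scale = "none")
```

Because the clustering is done only on the uncensored matrix, hclust never sees the censored values. Here scale = "none" is a choice made for the sketch so that a censored zero stays at zero; with the default row scaling the zeros would be shifted along with the rest of the row. If you are using pheatmap rather than heatmap, its cluster_rows and cluster_cols arguments also accept an hclust object, so a similar trick should be possible there.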