17 months ago by
I have never heard of Z-scale data, so can't speak to that. But a z-score is computed by centering and scaling the data by the standard deviation, which is what the scale function does, so long as you use the defaults.
As for generating a heatmap, the ComplexHeatmap package is a reasonable choice. How you process your data prior to generating a heatmap is dependent on how you want the resulting colors to be interpreted. To generate the colors for the heatmap, by definition the median of the values is set to be the 'middle' of the data, and the colors represent how much the data diverges from the median value.
If you use logCPM, then the median value will be somewhere around say 6-7, and the colors will represent how much a particular gene, in a particular sample diverges from that value. So the apparently down-regulated genes will be down-regulated as compared to the median of all the genes (and the same for the up-regulated genes). This may or may not be useful. But the differences can be interpreted as log fold changes, so that part is OK.
If you use z-scores, then the colors represent the divergence of a particular gene in a particular sample, as compared to the mean value for that gene over all samples. In units of standard deviations. So you can readily see which samples are going up or down for each gene, but it's harder to interpret the amount they are changing because it's dependent on how large the standard deviation for that gene is.