I'm new to R and am making a heatmap for some RNA sequencing data using pheatmap. My input data is the log2 CPM of genes across 5 samples (samples in columns, genes in rows). First, I want to understand whether I should scale my data beforehand using the scale() function. Second, I want to know whether I should set scale = "row" in the pheatmap() call. Here is my code:
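install.packages("pheatmap")
library(pheatmap)
library(RColorBrewer)  # provides brewer.pal()

# Read the log2 CPM table and move the first column (gene names) into the row names
heatmap_trial_2 <- read.csv("Final genes_log2CPM.csv")
heatmap_trial_2 <- data.frame(heatmap_trial_2[, -1], row.names = heatmap_trial_2[, 1])

# Row-wise (per-gene) z-score: scale() works on columns, so transpose before and after
sc_1 <- t(scale(t(heatmap_trial_2), center = TRUE, scale = TRUE))

pheatmap(sc_1,
         kmeans_k = NA, breaks = NA, scale = "none",
         cluster_rows = FALSE, cluster_cols = FALSE,
         show_rownames = TRUE, show_colnames = TRUE,
         color = colorRampPalette(brewer.pal(9, "BuPu"))(100))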
Here is the output I get when I run the above code:
However, I noticed that if I set scale = "row" in the pheatmap() call, the heatmap looks exactly the same regardless of whether I set scale = TRUE or scale = FALSE in the scale() function. Here is what the heatmap looks like in that case:
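To check whether this makes sense, here is a minimal sketch on made-up numbers rather than my real data. The `toy` matrix and the `row_z` helper are hypothetical names I introduced for illustration, and I am assuming that scale = "row" in pheatmap amounts to a per-row z-score, which is how I read its documentation. If that is right, applying the row z-score to data that is already row-scaled should change nothing:

```r
# Toy stand-in for a log2 CPM matrix: 4 genes (rows) x 5 samples (columns).
set.seed(1)
toy <- matrix(rnorm(20, mean = 5, sd = 2), nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("sample", 1:5)))

# Row-wise z-score, the same transform as t(scale(t(x))) in my code above.
row_z <- function(x) t(scale(t(x), center = TRUE, scale = TRUE))

once  <- row_z(toy)   # scaled one time
twice <- row_z(once)  # scaled again, as a second row-scaling step would do

# Every row of `once` already has mean 0 and sd 1, so the second pass
# should leave the values unchanged up to floating-point error.
all.equal(once, twice, check.attributes = FALSE)
```

So, under that assumption, scaling with scale() first and then again with scale = "row" would be redundant rather than harmful, which would explain why the two heatmaps look identical.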
If I don't scale it at all (scale = FALSE in scale() and scale = "none" in pheatmap()), this is what I get:
I understand the purpose of scaling in general after reading the R documentation for both functions and some other posts, so I know I should be scaling my data. I am just struggling to determine which of these is the correct way to do it for my data. At what step should I perform the scaling? Any help would be highly appreciated, thanks!
My session info:

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
stats graphics grDevices utils datasets methods base

other attached packages:
RColorBrewer_1.1-3 viridis_0.6.2 viridisLite_0.4.0 pheatmap_1.0.12

loaded via a namespace (and not attached):
magrittr_2.0.3 tidyselect_1.1.2 munsell_0.5.0 colorspace_2.0-3 R6_2.5.1 rlang_1.0.2 fansi_1.0.3 dplyr_1.0.9
tools_4.1.2 grid_4.1.2 gtable_0.3.0 utf8_1.2.2 DBI_1.1.3 cli_3.3.0 ellipsis_0.3.2 assertthat_0.2.1
tibble_3.1.7 lifecycle_1.0.1 crayon_1.5.1 gridExtra_2.3 purrr_0.3.4 ggplot2_3.3.6 vctrs_0.4.1 glue_1.6.2
compiler_4.1.2 pillar_1.7.0 generics_0.1.2 scales_1.2.0 pkgconfig_2.0.3