Showing average of replications on heatmap
John ▴ 30
@john-9676
Last seen 5.9 years ago

Hi,

I followed the DESeq2 tutorial http://www.bioconductor.org/help/workflows/rnaseqGene/ and the heatmap shows all replicates. I would like to make a heatmap of the average over replicates. Could you please advise?

Best,

John.

deseq2 • 3.1k views
John ▴ 30
@john-9676
Last seen 5.9 years ago

Hi Michael,

Thank you for the help. The experiment is a test for salinity and light response in fish.

I have two types of treatment. One is a long-term light treatment: long day (LD, control), short day (SD), and no light (NL, control). The other is salinity: freshwater (FW) vs. saltwater (SW). The experiment was conducted six times at different time points, which we call N1, N2, ..., N6. At each time point we collected 4 samples per light condition, as listed below: 2 were challenged with saltwater (SW) for 1 day and 2 with freshwater (FW). This gives two replicates for each salinity challenge.

N1 LD FW
N1 LD FW
N1 LD SW
N1 LD SW
N1 SD FW
N1 SD FW
N1 SD SW
N1 SD SW
N1 NL FW
N1 NL FW
N1 NL SW
N1 NL SW

etc for the rest N2, N3, ... N6

As you can see, we have two levels of replication: one for light and another for the salinity treatment. We want to test the fish response to salinity under long-day light, short-day light, and darkness. I need a heatmap of the averages, unless you have other suggestions. Is that difficult to do in DESeq2? It would be nice to have an option to show the average over replicates on heatmaps.

I hope this is clear. 

Thanks again.

J.
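
For reference, the design described above can be written down as a sample table. This is only a sketch in base R: the column names (`replicate`, `salinity`, `light`, `time`, `group`) are made up for illustration and are not taken from the original data.

```r
# Hypothetical sample table for the design described above:
# 6 time points x 3 light conditions x 2 salinity levels x 2 replicates
design <- expand.grid(
  replicate = 1:2,
  salinity  = c("FW", "SW"),
  light     = c("LD", "SD", "NL"),
  time      = paste0("N", 1:6)
)

# One label per time/light/salinity combination; replicates share a label
design$group <- paste(design$time, design$light, design$salinity, sep = "_")

nrow(design)                   # total number of samples
length(unique(design$group))   # number of groups after averaging replicates
```

A `group` column like this could then serve as the grouping variable when collapsing replicate columns.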


I would definitely want to see the within-group variability here. I feel like 72 samples is not too many for a heatmap, given that there are typically hundreds of rows.

That said, here is some example R code for collapsing values row-by-row:

m <- matrix(1:20, ncol=4)
t(apply(m, 1, function(row) c(mean(row[1:2]), mean(row[3:4]))))

This is not the fastest possible implementation, but one where you can see what is going on.
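
The same collapsing can also be driven by a grouping factor instead of hard-coded column indices, which scales better when there are many groups. The following is a minimal sketch in base R; the sample names and the `group` factor are hypothetical and only serve the toy matrix.

```r
# Toy matrix: 5 rows (genes) x 4 columns (samples), same values as above
m <- matrix(1:20, ncol = 4)
colnames(m) <- c("A_1", "A_2", "B_1", "B_2")  # hypothetical sample names

# Hypothetical grouping factor: one label per column of m
group <- factor(c("A", "A", "B", "B"))

# For each group level, average the columns that belong to it
collapsed <- sapply(levels(group), function(g) {
  rowMeans(m[, group == g, drop = FALSE])
})

collapsed  # one column per group, named by group level
```

With real data, `group` would come from the sample annotation (for example a combination of the light and salinity columns), so the column order of the matrix no longer has to follow a fixed pattern.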

@mikelove
Last seen 10 hours ago
United States

As I said in this earlier post, I prefer that a heatmap show the values (whether transformed normalized counts, FPKM or whatever) for all the replicates. This gives a sense of the within-group variability as well as the differences across groups. 

A: DESeq heatmap based on threshold

If you summarize to a single value across biological replicates, there are probably better ways to visualize than a heatmap. Can you give more description of what you want to show? How many groups? Can you briefly describe what the experiment looks like?

John ▴ 30
@john-9676
Last seen 5.9 years ago

Hi Michael,

Thank you so much. I had to provide new annotation_col values because the number of columns changed. Do you think the following is correct?

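library(pheatmap)
library(RColorBrewer)

# Drop rows with NA values, then select significant up-regulated genes.
# Selecting by gene name (not row position) keeps the selection aligned
# with assay(vd) even though na.omit() removed rows from res.
res <- na.omit(res)
topVarGenes <- rownames(res)[which(res$padj < 0.1 & res$log2FoldChange > 0)]
mat <- assay(vd)[topVarGenes, ]
# Average each pair of replicate columns (assumes the 8 columns are ordered in replicate pairs)
mat <- t(apply(mat, 1, function(row) c(mean(row[1:2]), mean(row[3:4]), mean(row[5:6]), mean(row[7:8]))))
colnames(mat) <- c("LD_FW", "LD_SW", "SD_FW", "SD_SW")

# Center each gene on its mean across the four groups
mat <- mat - rowMeans(mat)

# Column annotation for the averaged groups; row names must match colnames(mat)
df <- data.frame(c("FW", "SW", "FW", "SW"), c("LD", "LD", "SD", "SD"))
rownames(df) <- c("LD_FW", "LD_SW", "SD_FW", "SD_SW")
colnames(df) <- c("salinity", "daylength")

pheatmap(mat, color = colorRampPalette(rev(brewer.pal(n = 9, name = "RdYlGn")))(100), cluster_rows = TRUE, show_rownames = FALSE, cluster_cols = TRUE, annotation_col = df)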

Thanks for the help.

J.

Yes.
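
As a side note, the annotation data frame can also be derived from the group labels rather than typed out by hand, which avoids mismatches when more groups (for example NL) are added. This is a sketch in base R that assumes the labels follow the `daylength_salinity` pattern used above.

```r
# Group labels as used for the averaged columns
groups <- c("LD_FW", "LD_SW", "SD_FW", "SD_SW")

# Split each label at the underscore: first part is day length, second is salinity
parts <- do.call(rbind, strsplit(groups, "_", fixed = TRUE))

# Annotation table with one row per averaged column, row names matching the labels
df <- data.frame(
  salinity  = parts[, 2],
  daylength = parts[, 1],
  row.names = groups
)

df
```

The resulting `df` has the same row names and column names as the hand-built version, so it can be passed to `annotation_col` in the same way.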