Dear all,
Can anyone help me with this question, which I also posted on Biostars (https://www.biostars.org/p/197038/)?
I have pasted it here to make it easier:
I did a DEG analysis for two conditions with 5 time points using DESeq2. In the heatmap I have one colour for each of the DEGs with the lowest p-values, which represents the log2 fold change (Wald test) for each DEG compared with time 0. My question is this: I have 4 samples at time point 1 vs 4 samples at time 0, but I see only one colour for all of them, as time0-vs-1. Is this colour the mean or the median of the LFCs of the 4 samples? Or is there an algorithm behind it in the heatmap? Many thanks in advance for your help, Cheers, Rahel
Dear Michael,
Many thanks for your reply. I have two treatment responses (R, NR) and 7 time points. I want to see the DEGs in response to treatment over the time course. Here are the DESeq2 commands:
> ddsTC<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~IFX_response+time+ IFX_response:time)
> ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~IFX_response + time)
> colData(ddsTC)
DataFrame with 61 rows and 4 columns
                                time IFX_response sizeFactor replaceable
                            <factor>     <factor>  <numeric>   <logical>
01_0h_IFX_NR_P21_F02312.txt    01_0h       IFX_NR   1.206895       FALSE
01_0h_IFX_NR_P23_F02269.txt    01_0h       IFX_NR   1.024494       FALSE
...                              ...          ...        ...         ...
07_14w_IFX_R_P13_E02898.txt   07_14w        IFX_R  0.7851559        TRUE
07_14w_IFX_R_P14_E02900.txt   07_14w        IFX_R  1.0228529        TRUE
> resTC <- results(ddsTC)
> betas <- coef(ddsTC)
> colnames(betas)
But when I make the pheatmap with the exact commands from DESeq2:
> topGenes <- head(order(resTC$padj),50)
> mat <- betas[topGenes, -c(1,2)]
> thr <- 3
> mat[mat < -thr] <- -thr
> mat[mat > thr] <- thr
> pheatmap(mat,breaks=seq(from=-thr, to=thr, length=101),border_color="NA",cluster_col=FALSE)
I get one color block for each time point in comparison to time 0.
http://i.imgur.com/PqrnJQY.jpg
My question is: how, from 4 replicates per time point, do I get one colour block per time point? Is it correct that the genes are picked by p-value and coloured by LFC, and that each block represents the mean normalized read count of the 4 samples?
I hope I made it clear.
Cheers,
Rahel
Yes, in the code above you picked genes by adjusted p value, and the color is the log2 fold change.
But, no, the color in the cell is not the mean normalized read count of the 4 samples.
The color is the log2 fold change estimated by comparing the 4 samples with the other 4 samples. The log2 fold change is a parameter estimated by DESeq2, so there is one value per gene per comparison rather than one per sample. The details of how it is estimated are in the vignette; for full details, you can read the DESeq2 paper.
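To make the "one parameter per comparison" point concrete, here is a minimal sketch in base R. It is only an analogy and not what DESeq2 does internally: DESeq2 fits a negative binomial GLM to the counts, whereas this toy uses an ordinary linear model (`lm`) on made-up log2 expression values for a single hypothetical gene. The names `log2_expr`, `time`, `fit`, `lfc` and `diff_means` are invented for the illustration.

```r
# Toy illustration in base R (an analogy, not DESeq2 itself).
# One hypothetical gene, 4 replicates at time 0 and 4 at time 1.
# All values are made up.
log2_expr <- c(5.1, 4.9, 5.3, 5.0,   # time 0 replicates
               7.0, 7.4, 6.9, 7.1)   # time 1 replicates
time <- factor(rep(c("t0", "t1"), each = 4))

# A model with design ~ time has an intercept plus ONE coefficient for
# "t1 vs t0", no matter how many replicates each group contains.
fit <- lm(log2_expr ~ time)
lfc <- unname(coef(fit)["timet1"])

# In this simple linear model the coefficient equals the difference of
# the group means on the log2 scale. DESeq2's estimate comes from a
# different model, so this equality is specific to the toy example.
diff_means <- mean(log2_expr[time == "t1"]) - mean(log2_expr[time == "t0"])

print(length(coef(fit)))           # number of coefficients, not samples
print(lfc)                         # the single "t1 vs t0" value
print(all.equal(lfc, diff_means))
```

The heatmap in the question is drawn from `coef(ddsTC)`, so each column is a model coefficient like `lfc` here, which is why 4 replicates collapse into one cell per gene.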
Many, many thanks for your prompt reply, and also for such an amazing program, which has helped me a lot in my work. I got my answer, and I will read your paper before sending the next question ...
Cheers,
Rahel