Question: Averaging over biological replicates with DESeq2 for heatmap plotting
14 months ago, chighfi wrote:

I have a question about averaging biological replicates. The code below plots the data for each sample individually. When and how should I combine my biological replicates for plotting? I would like to average over the CONDITION column and have tried many ways without success. I thought you might have an answer.


DataFrame with 6 rows and 10 columns
          sampleName  fileName     LINE EXPOSURE CONDITION   TISSUE       REP
            <factor>  <factor> <factor> <factor>  <factor> <factor> <integer>
A1H_Acute  A1H_Acute A1H_Acute      CSB    Acute   Cocaine        H         1
A2H_Acute  A2H_Acute A2H_Acute      CSB    Acute   Cocaine        H         2
A3H_Acute  A3H_Acute A3H_Acute      CSB    Acute   Cocaine        H         3
B1H_Acute  B1H_Acute B1H_Acute      CSB    Acute Sucrose_C        H         1
B2H_Acute  B2H_Acute B2H_Acute      CSB    Acute Sucrose_C        H         2
B3H_Acute  B3H_Acute B3H_Acute      CSB    Acute Sucrose_C        H         3
               SEX individual        sizeFactor
          <factor>   <factor>         <numeric>
A1H_Acute        M         AM  1.23895646591537
A2H_Acute        M         AM 0.709636373005609
A3H_Acute        M         AM  1.39159832544129
B1H_Acute        M         BM 0.738832280319489
B2H_Acute        M         BM 0.908432365721923
B3H_Acute        M         BM  1.24898796150053



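library(DESeq2)
library(pheatmap)

dds <- DESeqDataSetFromMatrix(countData = AcuteCountsMheadCO, colData = AcuteSampleTable1MheadCO, design = ~ CONDITION)

# myTest is not defined in the post; it is assumed here to be the fitted object.
myTest <- DESeq(dds)

# Regularized-log transform that takes the design into account (blind = FALSE).
rld <- rlog(myTest, blind = FALSE)

# Rows ordered by mean normalized count, highest first. The post cuts this call
# off; decreasing = TRUE and the top 20 follow the DESeq2 vignette and are assumptions.
select <- order(rowMeans(counts(myTest, normalized = TRUE)), decreasing = TRUE)[1:20]

# Column annotations for the heatmap.
df <- as.data.frame(colData(myTest)[, c("CONDITION", "TISSUE")])

pheatmap(assay(rld)[select, ], cluster_rows = FALSE, show_rownames = FALSE,
         cluster_cols = FALSE, annotation_col = df)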
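
For reference, one way to average replicate columns by group can be sketched in base R. This is a minimal illustration, not the poster's code: the matrix `mat` is a made-up stand-in for `assay(rld)`, and `condition`, `cols_by_group`, and `avg` are hypothetical names introduced here.

```r
# Toy stand-in for assay(rld): 4 genes x 6 samples (values are made up).
mat <- matrix(
  c(1, 2, 3, 7, 8, 9,
    4, 4, 4, 2, 2, 2,
    0, 1, 2, 0, 1, 2,
    5, 6, 7, 5, 6, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    paste0("gene", 1:4),
    c("A1H_Acute", "A2H_Acute", "A3H_Acute",
      "B1H_Acute", "B2H_Acute", "B3H_Acute")
  )
)

# One group label per column, in column order (mirrors the CONDITION column).
condition <- factor(rep(c("Cocaine", "Sucrose_C"), each = 3))

# Column indices belonging to each condition.
cols_by_group <- split(seq_len(ncol(mat)), condition)

# For each condition, take the row-wise mean of its replicate columns.
avg <- sapply(cols_by_group, function(idx) rowMeans(mat[, idx, drop = FALSE]))
```

With real data, `mat` would presumably be `assay(rld)` and `condition` would be `colData(rld)$CONDITION`; the resulting genes-by-condition matrix `avg` could then be passed to `pheatmap()` in place of `assay(rld)[select, ]`. Because rlog values are on a log2 scale, this averages log-scale expression rather than raw counts.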

Answer: Averaging over biological replicates with DESeq2 for heatmap plotting
14 months ago, James W. MacDonald (United States) wrote:

There is usually no profit in doing a heatmap after combining replicates unless you have far more samples than that. If you had maybe 20 different groups with something like 6 replicates per group, it might make sense to use the mean expression values, because in that scenario you might want to show broad differences between groups without confusing the issue with all those columns. But in your case you will have a 6-column heatmap, and using all the samples will allow people to see how similar the samples are within each group. Collapsing that to a 2-column heatmap is A) boring and B) hides information that people may want to see.

Also, unless you are really trying to show the top most highly expressed genes, your code doesn't make sense to me. In this scenario a heatmap is usually intended to show something about the set of differentially expressed genes, rather than the most highly expressed (which are probably just housekeeping genes that aren't even changing expression).
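
As a rough illustration of that distinction, the sketch below picks genes by adjusted p-value instead of by mean expression. It is only a sketch under stated assumptions: `res` is a made-up stand-in for a DESeq2 results table converted with `as.data.frame()` (only the `padj` column is used), and the cutoff `alpha` and the name `sig` are hypothetical.

```r
# Stand-in for as.data.frame(results(...)): only the padj column matters here.
res <- data.frame(
  padj = c(0.20, 0.001, NA, 0.04, 0.50),
  row.names = paste0("gene", 1:5)
)

alpha <- 0.05  # hypothetical significance cutoff

# Keep genes with a non-missing adjusted p-value below the cutoff.
sig <- rownames(res)[!is.na(res$padj) & res$padj < alpha]

# Order the significant genes from smallest to largest adjusted p-value.
sig <- sig[order(res[sig, "padj"])]
```

The `NA` check matters because DESeq2 can report `NA` adjusted p-values for genes removed by independent filtering. With real data, the heatmap rows could then be taken as `assay(rld)[sig, ]` rather than the most highly expressed genes.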


Hi James, 

I realize my code is not correct when it comes to plotting the most significant differentially expressed genes. I am very new to DESeq2 and to making heatmaps. In the end I want to generate a heatmap of the pooled samples using only the most significant genes. In this case that was 104 genes.

written 14 months ago by chighfi