Averaging over biological replicates with DESeq2 for heatmap plotting
chighfi • 0
@chighfi-17631

I have a question about averaging biological replicates. The code below plots data for each sample. When and how do I combine my biological replicates for plotting? I would like to average over the CONDITION column and have tried many ways. I thought you might have an answer.

colData(rld)

DataFrame with 6 rows and 10 columns
          sampleName  fileName     LINE EXPOSURE CONDITION   TISSUE       REP
            <factor>  <factor> <factor> <factor>  <factor> <factor> <integer>
A1H_Acute  A1H_Acute A1H_Acute      CSB    Acute   Cocaine        H         1
A2H_Acute  A2H_Acute A2H_Acute      CSB    Acute   Cocaine        H         2
A3H_Acute  A3H_Acute A3H_Acute      CSB    Acute   Cocaine        H         3
B1H_Acute  B1H_Acute B1H_Acute      CSB    Acute Sucrose_C        H         1
B2H_Acute  B2H_Acute B2H_Acute      CSB    Acute Sucrose_C        H         2
B3H_Acute  B3H_Acute B3H_Acute      CSB    Acute Sucrose_C        H         3
               SEX individual        sizeFactor
          <factor>   <factor>         <numeric>
A1H_Acute        M         AM  1.23895646591537
A2H_Acute        M         AM 0.709636373005609
A3H_Acute        M         AM  1.39159832544129
B1H_Acute        M         BM 0.738832280319489
B2H_Acute        M         BM 0.908432365721923
B3H_Acute        M         BM  1.24898796150053

 

 

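library(DESeq2)
library(pheatmap)

dds <- DESeqDataSetFromMatrix(countData = AcuteCountsMheadCO,
                              colData = AcuteSampleTable1MheadCO,
                              design = ~ CONDITION)
myTest <- DESeq(dds)

# rlog-transform the counts; blind = FALSE lets the transformation use the design
rld <- rlog(myTest, blind = FALSE)

# Top 20 genes by mean normalized count (most highly expressed, not most significant)
select <- order(rowMeans(counts(myTest, normalized = TRUE)),
                decreasing = TRUE)[1:20]

# Column annotation for the heatmap
df <- as.data.frame(colData(myTest)[, c("CONDITION", "TISSUE")])

pheatmap(assay(rld)[select, ], cluster_rows = FALSE, show_rownames = FALSE,
         cluster_cols = FALSE, annotation_col = df)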

deseq2 pheatmap average • 1.5k views
@james-w-macdonald-5106

There is usually no profit in making a heatmap after combining replicates unless you have far more samples than that. If you had, say, 20 different groups with 6 replicates per group, it might make sense to use the mean expression values, because in that scenario you might want to show broad differences between groups without confusing the issue with all those columns. But in your case you will have a 6-column heatmap, and using all the samples will let people see how similar the samples are within each group. Collapsing that to a 2-column heatmap is (A) boring and (B) obscures information that people may want to see.

Also, unless you are really trying to show the most highly expressed genes, your code doesn't make sense to me. In this scenario a heatmap is usually intended to show something about the set of differentially expressed genes, rather than the most highly expressed ones (which are probably just housekeeping genes that aren't even changing expression).
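
As a rough illustration of selecting differentially expressed genes instead of the most highly expressed ones, here is a minimal sketch. It uses a small made-up table in place of the output of results(myTest), and the 0.05 cutoff on padj is an assumption for illustration, not something stated in this thread.

```r
# Toy stand-in for as.data.frame(results(myTest)); the values are invented.
res <- data.frame(
  log2FoldChange = c(2.1, -0.1, -1.8, 0.3),
  padj           = c(0.001, 0.8, 0.03, NA),  # DESeq2 can report NA for filtered genes
  row.names      = paste0("gene", 1:4)
)

# Keep genes below the (assumed) adjusted p-value cutoff; which() drops NA rows.
sig <- res[which(res$padj < 0.05), ]

# Order from most to least significant and keep the gene names.
sig <- sig[order(sig$padj), ]
sig_genes <- rownames(sig)

print(sig_genes)
```

With the real objects, the same idea would be to start from res <- results(myTest) and then index the transformed data as assay(rld)[sig_genes, ] in place of assay(rld)[select, ].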


Hi James, 

I realize my code is not correct for plotting the most significant DE genes. I am very new to DESeq2 and to making heatmaps. In the end I want to generate a heatmap of the pooled (averaged) samples using only the most significant genes. In this case that was 104 genes.
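
If averaged columns are still wanted, one possible approach is to take the row means of the transformed values within each level of CONDITION. The sketch below uses a small random matrix as a stand-in for assay(rld) restricted to the significant genes; the matrix, the gene names, and the object names mat, condition and avg are all hypothetical.

```r
# Toy stand-in for assay(rld)[sig_genes, ]: 4 genes x 6 samples of random values.
set.seed(1)
mat <- matrix(
  rnorm(24), nrow = 4,
  dimnames = list(paste0("gene", 1:4),
                  c("A1H_Acute", "A2H_Acute", "A3H_Acute",
                    "B1H_Acute", "B2H_Acute", "B3H_Acute"))
)

# Stand-in for colData(rld)$CONDITION, in the same order as the columns of mat.
condition <- factor(rep(c("Cocaine", "Sucrose_C"), each = 3))

# One column per condition: the mean of that condition's replicates for each gene.
avg <- sapply(levels(condition), function(lv) {
  rowMeans(mat[, condition == lv, drop = FALSE])
})

print(avg)

# With the pheatmap package loaded, the averaged matrix could then be plotted, e.g.:
# pheatmap(avg, cluster_cols = FALSE)
```

Note that with only two averaged columns, row scaling (scale = "row" in pheatmap) would give every gene the same pair of values, roughly +0.71 and -0.71, so the plot would only show the direction of change. That is one more reason to keep the individual replicates in the heatmap.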
