Averaging over biological replicates with DESeq2 for heatmap plotting
@chighfi-17631

I have a question about averaging biological replicates. The code below plots data for each sample. When and how should I combine my biological replicates for plotting? I would like to average over the CONDITION column and have tried many ways. I thought you might have an answer.

colData(rld)

DataFrame with 6 rows and 10 columns
          sampleName fileName  LINE     EXPOSURE CONDITION TISSUE   REP
          <factor>   <factor>  <factor> <factor> <factor>  <factor> <integer>
A1H_Acute A1H_Acute  A1H_Acute CSB      Acute    Cocaine   H        1
A2H_Acute A2H_Acute  A2H_Acute CSB      Acute    Cocaine   H        2
A3H_Acute A3H_Acute  A3H_Acute CSB      Acute    Cocaine   H        3
B1H_Acute B1H_Acute  B1H_Acute CSB      Acute    Sucrose_C H        1
B2H_Acute B2H_Acute  B2H_Acute CSB      Acute    Sucrose_C H        2
B3H_Acute B3H_Acute  B3H_Acute CSB      Acute    Sucrose_C H        3
          SEX      individual sizeFactor
          <factor> <factor>   <numeric>
A1H_Acute M        AM         1.23895646591537
A2H_Acute M        AM         0.709636373005609
A3H_Acute M        AM         1.39159832544129
B1H_Acute M        BM         0.738832280319489
B2H_Acute M        BM         0.908432365721923
B3H_Acute M        BM         1.24898796150053

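library(DESeq2)
library(pheatmap)
# Build the dataset from the count matrix and sample table, modelling CONDITION
dds <- DESeqDataSetFromMatrix(countData = AcuteCountsMheadCO, colData = AcuteSampleTable1MheadCO, design = ~ CONDITION)
myTest <- DESeq(dds)
# rlog transformation that takes the design into account (blind = FALSE)
rld <- rlog(myTest, blind = FALSE)
# Row indices of the 20 genes with the highest mean normalized count
select <- order(rowMeans(counts(myTest, normalized = TRUE)), decreasing = TRUE)[1:20]
# Column annotation for the heatmap
df <- as.data.frame(colData(myTest)[, c("CONDITION", "TISSUE")])
pheatmap(assay(rld)[select, ], cluster_rows = FALSE, show_rownames = FALSE, cluster_cols = FALSE, annotation_col = df)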

@james-w-macdonald-5106

There is usually no profit in doing a heatmap after combining replicates unless you have far more samples than that. If you had maybe 20 different groups with around 6 replicates per group, it might make sense to use the mean expression values, because in that scenario you might want to show broad differences between groups without confusing the issue with all those columns. But in your case you will have a 6-column heatmap, and using all the samples will allow people to see how similar the samples are within each group. Collapsing that to a 2-column heatmap A) is boring and B) obscures information that people may want to see.
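For the many-groups case where means do make sense, here is a minimal sketch of averaging columns by group. It uses only base R on a small made-up matrix standing in for assay(rld); the helper name avg_by_group and the toy values are assumptions for illustration, not part of DESeq2.

```r
# Stand-in for assay(rld): 3 genes x 6 samples of made-up log-scale values.
mat <- matrix(
  c( 1,  2,  3,  7,  8,  9,
     4,  4,  4,  5,  5,  5,
    10, 12, 14, 10, 12, 14),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    paste0("gene", 1:3),
    c("A1H_Acute", "A2H_Acute", "A3H_Acute",
      "B1H_Acute", "B2H_Acute", "B3H_Acute")
  )
)

# Stand-in for colData(rld)$CONDITION: one label per column of mat.
condition <- factor(rep(c("Cocaine", "Sucrose_C"), each = 3))

# Hypothetical helper: mean of each row within each level of `group`.
avg_by_group <- function(mat, group) {
  group <- droplevels(as.factor(group))
  stopifnot(length(group) == ncol(mat))
  # Column indices per group, then one vector of row means per group.
  idx_by_group <- split(seq_len(ncol(mat)), group)
  means <- lapply(idx_by_group, function(idx) {
    rowMeans(mat[, idx, drop = FALSE])
  })
  # Bind into a genes x groups matrix; list names become column names.
  do.call(cbind, means)
}

avg <- avg_by_group(mat, condition)
print(avg)

# With the objects from the question this would look like (not run here):
# avg <- avg_by_group(assay(rld), colData(rld)$CONDITION)
# pheatmap(avg[select, ], cluster_rows = FALSE, cluster_cols = FALSE)
```

Averaging the transformed (rlog) values rather than raw counts keeps the means on the same scale as the per-sample heatmap, which is why the sketch takes the transformed matrix as input.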

Also, unless you are really trying to show the most highly expressed genes, your code doesn't make sense to me. In this scenario a heatmap is usually intended to show something about the set of differentially expressed genes, rather than the most highly expressed ones (which are probably just housekeeping genes that aren't even changing expression).
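As a rough sketch of selecting differentially expressed genes instead of the most highly expressed ones: the block below filters on the adjusted p-value and centers each row before plotting. The data frame res stands in for the output of results(myTest), the matrix mat stands in for assay(rld), and both the values and the 0.05 cutoff are assumptions for illustration.

```r
# Stand-in for as.data.frame(results(myTest)); padj can be NA for genes
# that were filtered out, so NAs have to be handled explicitly.
res <- data.frame(
  padj = c(0.001, NA, 0.20, 0.03, 0.80),
  row.names = paste0("gene", 1:5)
)

# Stand-in for assay(rld): 5 genes x 6 samples of made-up values.
set.seed(1)
mat <- matrix(
  rnorm(5 * 6, mean = 8),
  nrow = 5,
  dimnames = list(rownames(res), paste0("sample", 1:6))
)

alpha <- 0.05  # assumed significance cutoff

# Keep genes with a non-missing adjusted p-value below the cutoff,
# ordered from most to least significant.
keep <- !is.na(res$padj) & res$padj < alpha
sig <- rownames(res)[keep]
sig <- sig[order(res[sig, "padj"])]

# Center each selected gene on its own mean so the heatmap shows
# differences between samples rather than absolute expression level.
sub <- mat[sig, , drop = FALSE]
centered <- sub - rowMeans(sub)
print(sig)

# With the objects from the question this would look like (not run here):
# res <- results(myTest)
# pheatmap(centered, cluster_cols = FALSE, annotation_col = df)
```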


Hi James, 

I realize my code is not correct for plotting the most significant DE genes. I am very new to DESeq2 and to making heatmaps. In the end I want to generate a heatmap of the "pooled" samples using only the most significant genes. In this case that was 104 genes.
