Hi all,
I am trying to perform hierarchical clustering on the results of DESeq2.
My experimental design has 12 different conditions, each with a number of biological replicates.
To do this, I have used vst() on my deseq object, followed by scale() to get Z-scores.
VST <- vst(dds_deseq, blind=FALSE)
Counts <- assay(VST)
Scaled_counts <- t(scale(t(as.matrix(Counts)), center=TRUE, scale=TRUE))
I then use hclust() to produce the dendrograms.
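For illustration, here is a minimal sketch of what the clustering step looks like, run on toy data rather than my real matrix. The matrix `toy` is a made-up stand-in for `Counts`, and the Euclidean distance and complete linkage are just the `dist()` and `hclust()` defaults, assumed here for the example.

```r
# Toy stand-in for the VST matrix: 20 genes x 6 samples (hypothetical data)
set.seed(1)
toy <- matrix(rnorm(120, mean = 8), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6)))

# Row-wise Z-scores, same pattern as for Scaled_counts
toy_scaled <- t(scale(t(toy), center = TRUE, scale = TRUE))

# Cluster genes (rows) and samples (columns) separately.
# dist() defaults to Euclidean distance, hclust() to complete linkage.
gene_tree   <- hclust(dist(toy_scaled))
sample_tree <- hclust(dist(t(toy_scaled)))
```

Calling `plot(sample_tree)` then draws the sample dendrogram.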
This seems to work well, and the dendrograms I get match my expectations. However, because of the large number of biological replicates, the resulting clusters are "overly detailed" and the heatmap is huge. I would like to simplify this by averaging my biological replicates and re-running the clustering on those averages.
I am unsure at what point it would be appropriate to take that average (i.e. I would expect that taking an average of Z-scores would distort the data).
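To make the question concrete, here is a minimal sketch of one possible ordering: average the replicates on the VST scale first, then compute Z-scores from those averages. It uses toy data only. The matrix `vst_mat` and the `condition` labels are made-up stand-ins for `assay(VST)` and the condition column of my sample table, and I do not know whether this ordering is the appropriate one.

```r
set.seed(1)
# Hypothetical stand-in for assay(VST): 5 genes x 6 samples
vst_mat <- matrix(rnorm(30, mean = 8), nrow = 5,
                  dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))

# Hypothetical condition labels, one per sample (column)
condition <- factor(c("A", "A", "B", "B", "C", "C"))

# Mean VST value per gene within each condition: genes x conditions matrix
cond_means <- sapply(levels(condition), function(lv) {
  rowMeans(vst_mat[, condition == lv, drop = FALSE])
})

# Row-wise Z-scores computed AFTER averaging the replicates
cond_scaled <- t(scale(t(cond_means), center = TRUE, scale = TRUE))

# Cluster the conditions on the averaged, scaled values
cond_tree <- hclust(dist(t(cond_scaled)))
```

In this sketch the Z-scores are calculated from the per-condition means, so each row of `cond_scaled` is centred across conditions rather than across individual samples. Averaging existing per-sample Z-scores would in general give different values, because the row standard deviations differ.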
It seems to me that there should be a way to average counts in the dds_deseq object. However, I do not know how to do this, and I am unsure whether it would also skew the results.
Alternatively, should I take the average at the level of the txi object? If so, could someone suggest how I can do this?
Many thanks!