Question

[DESEQ2] When to average biological replicates for hierarchical clustering


E • 0

@3732d500


United Kingdom

Hi all,

I am trying to perform hierarchical clustering on the results of DESeq2.

My experimental design has 12 different conditions, each with a number of biological replicates.

To do this, I have used vst() on my deseq object, followed by scale() to get Z-scores.

    VST <- vst(dds_deseq, blind=FALSE)
    Counts <- assay(VST)
    Scaled_counts <- t(scale(t(as.matrix(Counts)), center=TRUE, scale=TRUE))

I then use hclust() to produce the dendrograms.

This seems to work well, and the dendrograms I get match my expectations. However, because of the large number of biological replicates, the resulting clusters are "overly detailed" and the heatmap is huge. I would like to simplify this by averaging my biological replicates and re-running the clustering on those averages.

I am unsure at what point it would be appropriate to take that average (i.e. I would expect that taking an average of Z-scores would distort the data).

It seems to me that there should be a way to average counts in the dds_deseq object. However, I do not know how to do this, and I am unsure whether it would also skew the results.

Alternatively, should I take the average at the level of the txi object? If so, could someone suggest how I can do this?

Many thanks!

DESeq2 • 284 views


score 0 · Answer 1 · 2024-01-17


James W. MacDonald 66k

@james-w-macdonald-5106


United States

If you fit a cell means model (a design without an intercept, so that each group gets its own coefficient), the coefficients are the averages of the groups. You could also compute the averages from a treatment contrasts parameterization, but it's easier to just do the former.
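
To make the two parameterizations concrete, here is a minimal base-R sketch on a simulated matrix standing in for variance-stabilized values. It is an illustration only: it uses ordinary least squares via lm.fit() rather than DESeq2's own negative binomial fit, and every object name (vst_mat, condition, means_cm, and so on) is hypothetical and not taken from the original post. It shows that the cell means coefficients equal the per-group averages, that the same averages can be recovered from a treatment contrasts fit, and that the averaged matrix can then be row-scaled and clustered.

```r
# Simulated stand-in for a genes x samples matrix of transformed values.
# Assumption: 3 conditions with 4 replicates each (the post has 12 conditions).
set.seed(1)
condition <- factor(rep(c("A", "B", "C"), each = 4))
vst_mat <- matrix(
  rnorm(20 * 12, mean = 8), nrow = 20,
  dimnames = list(paste0("gene", 1:20),
                  paste0(condition, "_", rep(1:4, times = 3)))
)

# Cell means model: no intercept, one coefficient per condition.
design_cm <- model.matrix(~ 0 + condition)
fit_cm <- lm.fit(design_cm, t(vst_mat))    # lm.fit expects samples in rows
means_cm <- t(fit_cm$coefficients)         # back to genes x conditions
colnames(means_cm) <- levels(condition)

# Direct per-condition averages, for comparison.
means_direct <- sapply(levels(condition), function(lv) {
  rowMeans(vst_mat[, condition == lv, drop = FALSE])
})

# Treatment contrasts: intercept is condition A, the other coefficients
# are differences from A, so group means are intercept + difference.
design_tc <- model.matrix(~ condition)
coef_tc <- t(lm.fit(design_tc, t(vst_mat))$coefficients)
means_tc <- cbind(A = coef_tc[, 1],
                  B = coef_tc[, 1] + coef_tc[, 2],
                  C = coef_tc[, 1] + coef_tc[, 3])

# Row-scale the averaged values (Z-scores per gene) and cluster.
z <- t(scale(t(means_cm), center = TRUE, scale = TRUE))
hc_genes <- hclust(dist(z))
hc_conditions <- hclust(dist(t(z)))
```

In this sketch the averaging happens on the transformed scale, before the Z-scores are computed, so the heatmap would have one column per condition rather than one per replicate.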
