Question

Heatmap of the Count Matrix


Ferdinand David • 0

@b39b3713

Last seen 6 months ago

United Kingdom

In the standard workflow of the DESeq2 vignette (https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#standard-workflow), why is rowMeans used to select the genes shown in the heatmap of the count matrix? Could I use PCA results to select the top 20 genes instead? I have 6,770 genes, and I assume that the 20 genes with the highest mean normalized counts do not represent the data set well, so I would like to select genes by their PCA loadings. Does that make sense?

library("pheatmap")
select <- order(rowMeans(counts(dds,normalized=TRUE)),
                decreasing=TRUE)[1:20]
df <- as.data.frame(colData(dds)[,c("condition","type")])
pheatmap(assay(ntd)[select,], cluster_rows=FALSE, show_rownames=FALSE,
         cluster_cols=FALSE, annotation_col=df)

sessionInfo( )

rnaseqGene • 2.3k views

updated 3 months ago by ArnulfoByrd • 0 • written 6 months ago by Ferdinand David • 0


Perhaps PCA loadings offer a more comprehensive perspective. Now, thinking back, I recall struggling with a similar data reduction dilemma in my student days, trying to represent complex network traffic patterns using only a few key metrics. It felt like navigating a tricky Slither io level, trying to capture the essence without getting swallowed by the details.

ADD REPLY • link 3 months ago ArnulfoByrd • 0

score 1 · Answer 1 · 2025-06-04


ATpoint ★ 5.0k

@atpoint-13662

Last seen 7 days ago

Germany

This section of the vignette is only a very basic data exploration and is by no means set in stone. Here rowMeans merely serves as the ordering criterion that picks which genes to show; use whatever criterion you feel is appropriate to explore your data properly.
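As one possible alternative ordering, here is a minimal sketch that ranks genes by their absolute PCA loading on the first principal component. It uses only base R and a simulated matrix `mat` that stands in for a transformed count matrix such as `assay(ntd)`; the simulated data, the choice of PC1, and the cutoff of 20 genes are assumptions made for illustration, not part of the vignette.

```r
# Sketch: choose heatmap genes by PCA loadings instead of rowMeans.
# `mat` is a simulated stand-in for assay(ntd): genes in rows, samples in columns.
set.seed(1)
n_genes   <- 200
n_samples <- 6
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("sample", seq_len(n_samples))))

# Assumption for the demo: the first 10 genes differ strongly between
# samples 1-3 and samples 4-6.
mat[1:10, 4:6] <- mat[1:10, 4:6] + 10

# prcomp() expects observations (samples) in rows, so transpose.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# pca$rotation holds the loadings: one row per gene, one column per PC.
loadings <- pca$rotation

# Rank genes by absolute loading on PC1 and keep the top 20.
select    <- order(abs(loadings[, "PC1"]), decreasing = TRUE)[1:20]
top_genes <- rownames(mat)[select]
```

With real data, `select` could be passed to the heatmap call from the question in place of the rowMeans-based index, for example `assay(ntd)[select, ]`. Note that genes with high loadings are, by construction, the ones that drive the PCA, so the resulting heatmap will mostly restate what the PCA plot already shows.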

6 months ago ATpoint ★ 5.0k