Having TMM for GSEA
AZ ▴ 30
@fereshteh-15803
United Kingdom

Hi

The GSEA manual says:

Normalizing RNA-seq quantification to support comparisons of a feature's expression levels across samples is important for GSEA. Normalization methods (such as, TMM, geometric mean) which operate on raw counts data should be applied prior to running GSEA. Tools such as DESeq2 can be made to produce properly normalized data (normalized counts) which are compatible with GSEA

So I have two groups (n=9 versus n=24)

I ran my raw counts matrix through this code:

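library(edgeR)
dge <- DGEList(counts = M)       # M: raw counts matrix, genes in rows, samples in columns
dge <- calcNormFactors(dge)      # TMM normalization factors (the default method)
logCPM <- cpm(dge, log = TRUE)   # log2 counts per million, using the TMM-adjusted library sizes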

Does logCPM give proper input for GSEA?

edger deseq2 • 3.4k views
@gordon-smyth
WEHI, Melbourne, Australia

Yes, it's fine.

Alternatively, you could try camera() in edgeR, which has functionality analogous to GSEA.
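For anyone wanting to try the camera() route, here is a minimal sketch. The count matrix is simulated as a stand-in for the real M, and the gene sets `set1` and `set2` are hypothetical placeholders given as row indices; the group sizes (9 versus 24) follow the question. As far as I know, camera() on a DGEList expects dispersions to have been estimated first, so the sketch calls estimateDisp() before it.

```r
library(edgeR)

set.seed(1)

# Simulated stand-in for the raw counts matrix M (assumption: 1000 genes, 33 samples)
n_genes <- 1000
group <- factor(rep(c("A", "B"), times = c(9, 24)))
M <- matrix(rnbinom(n_genes * length(group), mu = 100, size = 5),
            nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)),
                            paste0("sample", seq_along(group))))

dge <- DGEList(counts = M, group = group)
dge <- calcNormFactors(dge)        # TMM is the default method
design <- model.matrix(~ group)    # column 2 is the B versus A effect
dge <- estimateDisp(dge, design)   # dispersions are needed before camera()

# Hypothetical gene sets, expressed as row indices into the count matrix
gene_sets <- list(set1 = 1:50, set2 = 51:150)

# Competitive gene set test for the B versus A contrast
res <- camera(dge, index = gene_sets, design = design, contrast = 2)
print(res)
```

With real data, the index list would normally be built from a gene set collection by matching gene identifiers to the row names of the count matrix.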


Sorry, Gordon Smyth,

Why do people say that TMM is not suitable in every context, while GSEA accepts TMM?

edgeR --> TMM (Trimmed Mean of M-values)
DESeq2 --> Geometric mean
Both are debatable and not suitable for every context.

Perhaps take a look at the output of ?DESeq2::counts

...

Description:

The counts slot holds the count data as a matrix of non-negative integer count values, one row for each observational unit (gene or the like), and one column for each sample.

...

normalized: logical indicating whether or not to divide the counts by the size factors or normalization factors before returning (normalization factors always preempt size factors)

...

Author(s): Simon Anders
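To make the `normalized` argument described in that help page concrete, here is a small sketch of pulling normalized counts out of DESeq2. The count matrix is simulated and the column name `group` is an assumption made for illustration; only the 9 versus 24 split comes from the question.

```r
library(DESeq2)

set.seed(1)

# Simulated stand-in for the raw counts matrix M (assumption: 1000 genes, 33 samples)
n_genes <- 1000
group <- factor(rep(c("A", "B"), times = c(9, 24)))
M <- matrix(rnbinom(n_genes * length(group), mu = 100, size = 5),
            nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)),
                            paste0("sample", seq_along(group))))

coldata <- data.frame(group = group, row.names = colnames(M))
dds <- DESeqDataSetFromMatrix(countData = M, colData = coldata, design = ~ group)

# Size factors have to be estimated before normalized counts can be requested
dds <- estimateSizeFactors(dds)

# Raw counts divided by the per-sample size factors
norm_counts <- counts(dds, normalized = TRUE)
head(norm_counts[, 1:3])
```

These values are on the count scale rather than the log scale, which is the main practical difference from the logCPM matrix in the question.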

I'm not sure why people love to have a debate about normalization methods, but it is irrelevant here. GSEA uses a very robust permutation algorithm and is not sensitive to the particular normalization method used. Any reasonable normalization method that produces logCPM-type values will be fine for GSEA. edgeR's calcNormFactors and cpm functions are certainly suitable, as the GSEA documentation itself already tells you.


Why do people say that TMM is not suitable in every context, while GSEA accepts TMM?

Please read my response: https://www.biostars.org/p/456240/#456346

For clarity, for GSEA, please use the log CPM values. If you are going the DESeq2 route, then use the variance-stabilised expression levels.
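For the DESeq2 route, a minimal sketch of getting variance-stabilised expression levels might look like the following. It uses DESeq2's own makeExampleDESeqDataSet() to simulate a dataset (3000 genes and 33 samples are arbitrary choices for illustration), so the design and sample labels are those of the simulated object, not of the real experiment.

```r
library(DESeq2)

set.seed(1)

# Simulated dataset standing in for the real counts; design is ~ condition
dds <- makeExampleDESeqDataSet(n = 3000, m = 33)

# blind = FALSE lets the transformation use the design when estimating dispersions
vsd <- vst(dds, blind = FALSE)

# Matrix of variance-stabilised values, genes in rows and samples in columns
vst_mat <- assay(vsd)
dim(vst_mat)
```

Note that vst() fits its dispersion trend on a subset of genes (the `nsub` argument), so for very small matrices varianceStabilizingTransformation() may be the more appropriate call.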
