Question

DESeq2 FPKM normalization

1

ribioinfo ▴ 100

@ribioinfo-9434

Last seen 3.7 years ago

Hi, if I use FPKM, I can compare expression across different samples and different experiments.

With DESeq2, I can compare the expression of the genes that are in the normalized table. How can I use the DESeq2 normalization to compare the expression of a gene in samples that come from different analyses?

Thanks

deseq2 normalization • 12k views

ADD COMMENT • link 8.2 years ago ribioinfo ▴ 100

score 1 · Answer 1 · 2016-02-08

1

Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 18 days ago

EMBL European Molecular Biology Laborat…

The DESeq normalisation is intended for relatively precise quantitative comparisons of samples that were consistently processed in the same experiment or study. It is not intended for comparisons across heterogeneous experiments/studies.
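For readers who have not seen it, the DESeq size factors are estimated with the median-of-ratios method. Below is a minimal Python sketch of that arithmetic only. It is not DESeq2's actual code, and the function name `size_factors` and the toy count matrix are made up for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts per sample.
    """
    n_samples = len(counts[0])
    # Genes with a zero count in any sample have a zero geometric mean,
    # so they are left out of the estimate.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples (the reference).
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to the reference.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data (made up): sample 2 is sequenced exactly twice as deep as sample 1.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],      # skipped for the estimate because of the zero
]
sf = size_factors(counts)
# Normalized counts: each raw count divided by its sample's size factor.
normalized = [[c / f for c, f in zip(row, sf)] for row in counts]
```

Because the reference is built from the samples in the count matrix itself, the resulting scale is only meaningful within that set of samples, which is why it does not transfer to a separately processed study.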

Methods for the latter include FPKM, TPM.
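As a reminder of what these two measures compute, here is a small Python sketch using the standard definitions, applied to a single sample. The helper names `fpkm_values` and `tpm_values` and the numbers are made up for illustration; they are not functions from any package.

```python
def fpkm_values(counts, lengths_bp):
    """Fragments per kilobase of gene per million mapped fragments."""
    total = sum(counts)
    # count / (length in kb) / (total in millions) = count * 1e9 / (length_bp * total)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]

def tpm_values(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r * 1e6 / total_rate for r in rates]

# Made-up counts and gene lengths (in base pairs) for one sample.
sample_counts = [500, 1000, 1500]
gene_lengths = [1000, 2000, 500]

fpkm_result = fpkm_values(sample_counts, gene_lengths)
tpm_result = tpm_values(sample_counts, gene_lengths)
```

Both measures use only quantities from the sample itself (its counts and the gene lengths), so they can be computed without reference to the other samples in a study.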

But note that such comparisons then tend to be of a more qualitative nature. Because of 'batch effects', it could require a lot of statistical finesse to meaningfully apply e.g. a hypothesis test or a (generalised) linear model approach. Nothing is impossible, but it is certainly not easy.

ADD COMMENT • link 8.2 years ago Wolfgang Huber ★ 13k

0

So if we need to compare two data sets from two different experiments, is it worth using ComBat to remove the batch effect and then using edgeR for TPM normalization and DE analysis?

ADD REPLY • link 5.6 years ago ag1805x ▴ 80

score 0 · Answer 2 · 2016-02-09

0

ribioinfo ▴ 100

@ribioinfo-9434

Last seen 3.7 years ago

Thank you. I have found this post: /Using DESeq normalized gene count to replace FPKM?

where there is this sentence:

"If you want to have the same scale as typical FPKM values (and so have better comparability across experiments), you could then divide everything by something like geometric mean of the total read counts of all samples / 1 million"
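The rescaling described in the quote can be sketched in Python as follows. This only illustrates the arithmetic under the assumption that the input values are already DESeq-normalized counts; the variable names and numbers are made up.

```python
import math

def geometric_mean(values):
    """Geometric mean, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Made-up normalized counts (already divided by size factors), genes x samples.
normalized_counts = [
    [14.0, 15.0],
    [140.0, 138.0],
]
# Made-up total raw read counts per sample.
total_reads = [20_000_000, 45_000_000]

# One constant for the whole experiment, in units of millions of reads
# (approximately 30 for these numbers).
scale = geometric_mean(total_reads) / 1e6

# Every sample is divided by the same constant.
per_million = [[v / scale for v in row] for row in normalized_counts]
```

Since a single constant is applied to all samples, the between-sample ratios of the normalized counts are left unchanged; only the overall scale moves to "per million". A further division by gene length in kilobases would be needed to reach an FPKM-like scale.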

But why do I have to use the geometric mean of the total read counts of all samples? If I want something like FPKM, shouldn't I divide each sample by the sum of its own read counts?

Thank you

ADD COMMENT • link 8.2 years ago ribioinfo ▴ 100

0

The column sum is not a robust estimator for the sequencing depth. But you can choose this option in fpkm() if you prefer.
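To illustrate the robustness point, here is a toy Python comparison of column sums against median-of-ratios size factors. It is a sketch on made-up data, not DESeq2 code. As far as I know, the option mentioned above is the `robust` argument of `fpkm()`, where `robust = FALSE` uses the column sums.

```python
import math
from statistics import median

def median_of_ratios(counts):
    """Size factors via median of ratios; assumes all counts are positive."""
    n = len(counts[0])
    log_ref = [sum(math.log(c) for c in row) / n for row in counts]
    return [
        math.exp(median(math.log(row[j]) - ref for row, ref in zip(counts, log_ref)))
        for j in range(n)
    ]

# Made-up data: four genes are identical in both samples,
# one gene is strongly up in sample 2.
counts = [
    [100, 100],
    [200, 200],
    [300, 300],
    [400, 400],
    [50, 5000],
]

col_sums = [sum(row[j] for row in counts) for j in range(2)]
robust_sf = median_of_ratios(counts)

# Gene 1 scaled by column sum (counts per million) in each sample.
gene1_cpm = [counts[0][j] / col_sums[j] * 1e6 for j in range(2)]
# Gene 1 scaled by the robust size factors.
gene1_robust = [counts[0][j] / robust_sf[j] for j in range(2)]
```

In this toy case the single outlier gene inflates the column sum of sample 2, so gene 1 appears several-fold lower there after column-sum scaling even though its raw count is identical. The median-based size factors are essentially equal for the two samples, so gene 1 stays unchanged.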

ADD REPLY • link 8.2 years ago Michael Love 41k