Question

Is it reasonable to use the rlog difference to reflect differential expression of a gene across samples, and the rlog value to reflect the expression level?

Yang_Zheng_Neuro • 0

@yang_zheng_neuro-14279

Last seen 5.0 years ago

HMS

Hi,

I'm trying to use a heatmap to show the expression pattern of genes of interest across samples (cell types). I landed on two options here:

1) Use the CPM (counts per million) row z-score and base_mean CPM, as shown here: https://drive.google.com/file/d/1KVP-flOnINjUOOTWbwVIacJITuXH68HU/view?usp=sharing

2) Use the rlog difference (rlog minus the mean rlog of the same gene across samples) and base_mean rlog, as shown in the second figure: https://drive.google.com/file/d/1rlHxaLRUW5K7CSk-OCCj_E0LNGY_8YjY/view?usp=sharing
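For concreteness, the two per-gene quantities described in the options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the z-score uses the sample standard deviation (n - 1), and the input to `center_row` is assumed to already be on a log2-like scale such as rlog. It does not compute rlog itself.

```python
from statistics import mean, stdev

def row_zscore(values):
    """Z-score one gene's values across samples (option 1, applied to CPM).

    Assumption: the sample standard deviation (n - 1) is used. A gene with
    no variation across samples gets all zeros instead of a division by zero.
    """
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]

def center_row(log_values):
    """Subtract the gene's mean across samples (option 2, applied to rlog).

    The result stays on the input scale, so for log2-scale input the
    differences can be read as log2 fold changes relative to the gene's mean.
    """
    m = mean(log_values)
    return [v - m for v in log_values]

# Hypothetical values for one gene in three samples.
z = row_zscore([1.0, 2.0, 3.0])
centered = center_row([8.0, 10.0, 12.0])
```

One difference worth noting: the z-score divides by each gene's own spread, so the magnitude of a change is no longer comparable between genes, whereas the centered values keep their original units.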

My question is:

Is it reasonable to use the rlog value to represent expression? Are the differences comparable, that is, does an rlog difference of 2 generally reflect a bigger change than an rlog difference of 1?

I know that it's not perfectly correlated with the count values, but I think it's an estimate of the 'true' expression.

Thank you very much.

deseq2 rnaseq • 605 views

ADD COMMENT • link updated 6.3 years ago by Michael Love 41k • written 6.3 years ago by Yang_Zheng_Neuro • 0

score 1 · Answer 1 · 2018-01-16

Michael Love 41k

@mikelove

Last seen 9 hours ago

United States

You can compare rlog and CPM values across samples as proportional to expression (especially if you used the tximport pipeline, which controls for differential gene length due to alternative isoform usage when working with gene counts). Because rlog values are on the log2 scale, an rlog difference of 2 is roughly a 4-fold change, and an rlog difference of 1 is roughly a 2-fold change.
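The arithmetic behind those numbers is just exponentiation of a log2 difference. A minimal sketch in Python (the function name is hypothetical, and the result is only approximate for rlog, which shrinks differences for low-count genes):

```python
def fold_change_from_log2_diff(diff):
    """Convert a difference on the log2 scale into a fold change.

    Assumption: the input is a difference of log2-scale values, such as
    rlog(sample A) - rlog(sample B) for the same gene.
    """
    return 2.0 ** diff

fc_two = fold_change_from_log2_diff(2.0)    # difference of 2
fc_one = fold_change_from_log2_diff(1.0)    # difference of 1
fc_down = fold_change_from_log2_diff(-1.0)  # negative difference: lower expression
```

A negative difference gives a fold change below 1, for example a difference of -1 corresponds to half the expression.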

ADD COMMENT • link 6.3 years ago Michael Love 41k

Thank you for your answer.

I'm also wondering whether it's fair to compare the abundance of mRNA between two genes.

It does not make a whole lot of sense but I want to achieve something like:

If a gene has CPM=5000 (or rlog=11) and another gene has CPM=2 (or rlog=0), I might favor the former as a candidate gene, for example, to generate mouse tools, etc.

ADD REPLY • link 6.3 years ago Yang_Zheng_Neuro • 0

No, rlog and CPM are not proportional to expression when comparing across genes. Longer genes will have higher rlog and CPM. For comparing across samples and genes, you would want a measure like TPM. This is the "abundance" matrix that is imported by tximport from transcript quantification methods.
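To illustrate the gene-length effect, here is a simplified Python sketch with made-up numbers. It uses the textbook TPM formula with fixed gene lengths. That is an assumption for illustration only: the abundances that tximport imports are estimated by the quantification tool from effective transcript lengths, not computed this way. All names and values below are hypothetical.

```python
def cpm(counts):
    """Counts per million: scales by library size only, ignoring gene length."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def tpm(counts, lengths):
    """Transcripts per million: divide by length first, then scale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]

# Two hypothetical genes with the same number of transcripts per cell.
# Gene A is 4x longer than gene B, so it yields about 4x as many reads.
lengths = [4000, 1000]   # gene A, gene B (bases)
counts = [400, 100]      # reads assigned to gene A, gene B

cpm_values = cpm(counts)           # gene A looks 4x higher than gene B
tpm_values = tpm(counts, lengths)  # gene A and gene B come out equal
```

In this toy case CPM ranks the longer gene higher purely because of its length, while TPM recovers that the two genes are equally abundant.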

ADD REPLY • link 6.3 years ago Michael Love 41k