Suggestion to report differential estimates in edgeR
Yongqing • 0
@b01bffdc
Hong Kong

Hi,

I am posting because I ran into an interesting question while using edgeR. I want to analyze RNA-seq data from samples in two groups: control and treatment (an inhibitor of a kinase). I want to perform a differential expression analysis to find out what the main role of this kinase is.

I know many people would do this based on fold change. However, I noticed that some genes have very small counts (for example, 30-50 counts) even though they have a large fold change. I am more interested in the difference in counts than in the fold change, because a gene with a large difference in counts (for example, 30000 counts) might have a huge influence on the cell even without a large fold change.

Also, a living cell has many chemical reactions going on. Suppose a molecule is present in very large numbers in a cell. Regulatory networks within the cell might prevent its abundance from growing further, and if its abundance dropped, say, four-fold, the cell might die. Such a molecule is therefore less likely to show as large a fold change as a molecule with a very low expression level, yet it is important.

I know edgeR needs the data to be normalized (TMM) first. However, if the software can report a fold change, it must have estimated the mean of the control group, the mean of the treatment group, and the difference between them. With these numbers, I could calculate the quantity I want for my analysis. So I am writing to ask whether edgeR could report these numbers. I believe it would help a lot of downstream lab work and benefit future biomedical discoveries.

Many thanks,

Yongqing

@gordon-smyth
WEHI, Melbourne, Australia

edgeR already reports results in an appropriate way. You can already get the average CPM or average logCPM per group by using the cpmByGroup function.
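
As a minimal sketch of how cpmByGroup might be used, the example below builds a small simulated data set and extracts per-group averages. The counts, gene names, group labels, and the variable names avg_cpm, avg_logcpm and diff_cpm are all hypothetical, and the columns of the result are assumed to follow the order of levels(group).

```r
library(edgeR)

# Hypothetical simulated counts: 1000 genes, 3 control and 3 treated samples
set.seed(1)
counts <- matrix(rnbinom(6000, mu = 100, size = 10), nrow = 1000)
rownames(counts) <- paste0("gene", seq_len(1000))
group <- factor(rep(c("control", "treated"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)  # TMM normalization factors

# Average CPM per group, one column per group
avg_cpm <- cpmByGroup(y)

# The same averages on the log2 scale
avg_logcpm <- cpmByGroup(y, log = TRUE)

# Difference between the group averages on the CPM scale
# (assumes column 1 is control and column 2 is treated)
diff_cpm <- avg_cpm[, 2] - avg_cpm[, 1]
head(cbind(avg_cpm, diff_cpm))
```

The difference is taken on the CPM scale rather than on raw counts, so it is not affected by differences in library size between samples.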

You seem to assume that logFC are always computed as the difference between group means, but that is not correct. For many linear models (paired comparisons is one common example) there are no group means.
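
To illustrate the paired case, here is a rough sketch with simulated data in which each of three hypothetical patients contributes one control and one treated sample. The design, the variable names and the choice of the quasi-likelihood pipeline are assumptions made for the example. The logFC here is the treatment coefficient of the fitted model, adjusted for patient, and not a difference between two group means.

```r
library(edgeR)

# Hypothetical paired experiment: 3 patients, each sampled before and after treatment
set.seed(2)
counts <- matrix(rnbinom(6000, mu = 100, size = 10), nrow = 1000)
rownames(counts) <- paste0("gene", seq_len(1000))
patient <- factor(rep(1:3, times = 2))
treatment <- factor(rep(c("control", "treated"), each = 3))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Additive model: patient baseline plus a common treatment effect
design <- model.matrix(~ patient + treatment)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# The last column of the design matrix is the treatment effect
qlf <- glmQLFTest(fit, coef = ncol(design))
topTags(qlf)
```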

edgeR already solves the problem that you identify, that large fold-changes are often associated with small counts, in a sophisticated and principled manner. The DE list that edgeR presents already balances count sizes and fold-changes in an appropriate way, requiring larger fold-changes from genes with small counts before they are assessed as significantly DE. edgeR already compares counts rather than relying on the size of the fold-change. That's why we have always strongly recommended that DE be assessed by p-values rather than by fold-change.
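
A small sketch of ranking by p-value in a simple two-group comparison follows. The data are simulated with a range of expression levels, and the use of exactTest and all variable names are assumptions made for illustration only. The results table shows logCPM next to logFC, so the abundance of each gene is visible alongside its fold change.

```r
library(edgeR)

# Hypothetical simulated counts covering low to high expression levels
set.seed(3)
mu <- rep(c(10, 100, 1000, 10000), each = 250)
counts <- matrix(rnbinom(6000, mu = mu, size = 10), nrow = 1000)
rownames(counts) <- paste0("gene", seq_len(1000))
group <- factor(rep(c("control", "treated"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
y <- estimateDisp(y)

# Two-group test; results are ranked by p-value, not by fold change
et <- exactTest(y)
top <- topTags(et, n = 10, sort.by = "PValue")
top$table[, c("logFC", "logCPM", "PValue", "FDR")]

# Number of genes called up, down, or not significant at the default FDR cutoff
summary(decideTests(et))
```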


Hi Gordon, thank you very much for your reply.
