Calculation of logCPM in edgeR
H ▴ 20
@H-24669

I am very new to bioinformatics, R, and edgeR. I used the following code:

library(edgeR)

# Targets file describing the samples (file names, groups)
targets <- read.delim("cell_line_M.txt", stringsAsFactors = FALSE)

d <- readDGE(targets)
colnames(d) <- c("MG1","MG2", "MN1","MN2")
d <- estimateCommonDisp(d, verbose=TRUE)
d <- estimateTagwiseDisp(d, trend="none")

et <- exactTest(d,pair=c("MN","MG"))
print(et)

This is the first line of the output I obtained:

          logFC   logCPM       PValue
A1CF  0.20103589 4.718215 0.8603790511

My question is how the logCPM is calculated. The counts for this gene in the input file are:

gene   MG1  MG2  MN1 MN2
A1CF    8   7    7   4

Assuming the library size of every column is 450000, the corresponding CPM values, according to the formula

CPM = count * 1e6 / (library size of that sample)

will be

gene   MG1   MG2   MN1  MN2
A1CF  17.7  15.5  15.5  8.8
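The per-sample CPM arithmetic above can be reproduced in a few lines of base R. This is only a sketch of the simple formula from the question, not of what edgeR does internally. The library size of 450000 is the assumption stated in the question, and `counts`, `lib_size` and `cpm_values` are illustrative names. Note that the exact values (17.78, 15.56, 15.56, 8.89) are slightly larger than the truncated figures in the table.

```r
# Counts for gene A1CF, taken from the table in the question
counts <- c(MG1 = 8, MG2 = 7, MN1 = 7, MN2 = 4)

# Assumption from the question: every library contains 450000 reads
lib_size <- rep(450000, length(counts))

# Simple counts-per-million: count * 1e6 / library size of that sample
cpm_values <- counts * 1e6 / lib_size

print(round(cpm_values, 2))
```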

I expected the logCPM to be calculated in the same way as the logFC, following the strategy suggested here:

geo_mean_CPM = sqrt(17.7 * 15.5) / sqrt(15.5 * 8.8) --> logCPM = log2(geo_mean_CPM)

Is this strategy correct, or is the calculation entirely different?

logCPM edgeR • 1.6k views

Linking to duplicate post: Calculation of logFC in edgeR

@gordon-smyth
WEHI, Melbourne, Australia

The edgeR User's Guide explains extensively that edgeR uses negative binomial generalized linear models, so the simple calculations you give could not be correct.

The logCPM values in the topTags table are computed by aveLogCPM. See ?aveLogCPM.
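As a rough illustration of what aveLogCPM does, here is a minimal sketch for the single gene from the question. It rests on several assumptions: all four library sizes are taken to be 450000 (as in the question), the prior count is the default of 2, and the internal steps (a prior count scaled by relative library size, and library sizes increased by twice that prior) follow my reading of ?aveLogCPM and should be checked against the help page. With equal library sizes the one-group negative binomial fit reduces to a plain mean of the counts, so the dispersion drops out of this special case. The names `prior_scaled`, `adj_lib` and `manual_logcpm` are illustrative only. The result need not match the 4.718 reported in the question, since the real library sizes probably differ from the assumed value.

```r
# Counts for gene A1CF as a one-row matrix (values from the question)
counts <- matrix(c(8, 7, 7, 4), nrow = 1,
                 dimnames = list("A1CF", c("MG1", "MG2", "MN1", "MN2")))

# Assumptions: equal library sizes and the default prior count of 2
lib_size <- rep(450000, ncol(counts))
prior_count <- 2

# Prior count scaled by relative library size (all equal to 2 here)
prior_scaled <- prior_count * lib_size / mean(lib_size)

# Library sizes are increased by twice the scaled prior count
adj_lib <- lib_size + 2 * prior_scaled

# With equal library sizes the one-group fit is just the mean count,
# so the average CPM is mean(count + prior) / adjusted library size * 1e6
ave_cpm <- mean(counts[1, ] + prior_scaled) / adj_lib[1] * 1e6
manual_logcpm <- log2(ave_cpm)
print(manual_logcpm)

# If edgeR is installed, compare with its own function
if (requireNamespace("edgeR", quietly = TRUE)) {
  print(edgeR::aveLogCPM(counts, lib.size = lib_size,
                         prior.count = prior_count))
}
```

When the library sizes are unequal, the fit weights the samples differently and depends on the dispersion, so the shortcut above no longer applies and aveLogCPM itself should be used.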
