Calculation of logCPM in edgeR
@H-24669

I am very new to bioinformatics, R, and edgeR. I used the following code:

targets <- read.delim("cell_line_M.txt", stringsAsFactors = FALSE)

colnames(d) <- c("MG1","MG2", "MN1","MN2")
d <- estimateCommonDisp(d, verbose=TRUE)
d <- estimateTagwiseDisp(d, trend="none")

et <- exactTest(d,pair=c("MN","MG"))
print(et)


This is the first line of the output I obtained:

          logFC   logCPM       PValue
A1CF  0.20103589 4.718215 0.8603790511


My question is how the logCPM is calculated. The input file is:

gene   MG1  MG2  MN1 MN2
A1CF    8   7    7   4


Assuming the library size of every column is 450000, the corresponding CPM values, according to the formula

CPM = count * 1e6 / (library size of that sample)

are

gene   MG1   MG2   MN1  MN2
A1CF  17.7  15.5  15.5  8.8
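
The arithmetic above can be reproduced in a few lines of base R. This is only a sketch of the per-sample formula quoted in the post, using the counts shown and the assumed library size of 450000 for every sample; the variable names are illustrative. Note that the figures in the table appear to be truncated rather than rounded to one decimal place.

```r
# Counts for gene A1CF, as given in the post
counts <- c(MG1 = 8, MG2 = 7, MN1 = 7, MN2 = 4)

# Assumption from the post: every library has 450000 reads
lib.size <- rep(450000, length(counts))

# Counts per million: scale each count by its own sample's library size
cpm <- counts * 1e6 / lib.size

print(round(cpm, 4))
```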


I expected the logCPM to be calculated in the same way as the logFC, following the strategy suggested here:

geo_mean_CPM = sqrt(17.7 * 15.5) / sqrt(15.5 * 8.8) --> logCPM = log2(geo_mean_CPM)

Is this strategy correct, or is the calculation entirely different?

logCPM edgeR

Linking to duplicate post: Calculation of logFC in edgeR

@gordon-smyth
WEHI, Melbourne, Australia

The edgeR User's Guide explains extensively that edgeR uses negative binomial generalized linear models, so the simple calculations you give could not be correct.

The logCPM values in the topTags table are computed by aveLogCPM. See ?aveLogCPM.
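
As a rough sketch of how one might call aveLogCPM directly on the counts from the question: the library sizes of 450000 are the asker's assumption, and the dispersion value below is a placeholder chosen for illustration, not the value estimated from the real data. The result is therefore not expected to reproduce the 4.718215 shown in the output above. For comparison, the last two lines evaluate the geometric-mean ratio proposed in the question, which is a contrast between the two groups rather than an average abundance.

```r
library(edgeR)

# Single-gene count matrix from the question
counts <- matrix(c(8, 7, 7, 4), nrow = 1,
                 dimnames = list("A1CF", c("MG1", "MG2", "MN1", "MN2")))

# Assumed library sizes (from the question, not the real data)
lib.size <- rep(450000, 4)

# Average log2-CPM across all four samples.
# prior.count = 2 is the documented default; dispersion = 0.05 is a placeholder.
ave <- aveLogCPM(counts, lib.size = lib.size, prior.count = 2, dispersion = 0.05)
print(ave)

# The ratio of geometric means from the question, on the log2 scale
cpm <- counts[1, ] * 1e6 / lib.size
ratio_log2 <- log2(sqrt(cpm["MG1"] * cpm["MG2"]) / sqrt(cpm["MN1"] * cpm["MN2"]))
print(unname(ratio_log2))
```

See ?aveLogCPM for how the prior count and dispersion enter the calculation.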