Calculation of logCPM in edgeR
H • 0

I am very new to bioinformatics, R, and edgeR. I used the following code:

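library(edgeR)

# Read the targets table (assumed to list the count files and group labels)
targets <- read.delim("cell_line_M.txt", stringsAsFactors = FALSE)

# Read the count files named in targets into a DGEList and rename the samples
d <- readDGE(targets)
colnames(d) <- c("MG1","MG2", "MN1","MN2")
# Estimate the common dispersion, then the tagwise (per-gene) dispersions
d <- estimateCommonDisp(d, verbose=TRUE)
d <- estimateTagwiseDisp(d, trend="none")

# Exact test comparing group MG against group MN
et <- exactTest(d,pair=c("MN","MG"))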

This is the first line of the output I obtained:

          logFC   logCPM       PValue
A1CF  0.20103589 4.718215 0.8603790511

My question is how the logCPM is calculated. The input count file is:

gene   MG1  MG2  MN1 MN2
A1CF    8   7    7   4

Assuming the library size of every column is 450000, the corresponding CPM values according to the formula

CPM = count * 1e6 / (library size of that sample)

will be

gene   MG1   MG2   MN1  MN2
A1CF  17.7  15.5  15.5  8.8
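
A minimal sketch of this per-sample CPM calculation in base R, assuming the counts shown above and the assumed library size of 450000 for every sample. The object names `counts`, `lib_size`, and `cpm_manual` are only illustrative:

```r
# Counts for the single gene shown above (one row, four samples)
counts <- matrix(c(8, 7, 7, 4), nrow = 1,
                 dimnames = list("A1CF", c("MG1", "MG2", "MN1", "MN2")))

# Assumed library sizes: 450000 reads in every sample
lib_size <- rep(450000, ncol(counts))

# CPM = count * 1e6 / library size, applied column by column
cpm_manual <- t(t(counts) / lib_size) * 1e6

print(round(cpm_manual, 2))
```

This simple formula uses no prior count and no dispersion, so it is only a rough guide to the scale of the values.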

I expected the logCPM to be calculated in the same way as the logFC, following the strategy suggested here:

geo_mean_CPM = sqrt(17.7 * 15.5) / sqrt(15.5 * 8.8) --> logCPM = log2(geo_mean_CPM)

Is this strategy correct, or is the calculation totally different?

logCPM edgeR • 96 views

Linking to duplicate post: Calculation of logFC in edgeR

WEHI, Melbourne, Australia

The edgeR User's Guide explains extensively that edgeR uses negative binomial generalized linear models, so the simple calculations you give could not be correct.

The logCPM values in the topTags table are computed by aveLogCPM. See ?aveLogCPM.
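
As a rough illustration, aveLogCPM can be called directly on a count matrix. The sketch below uses the single gene from the question together with the assumed library size of 450000 per sample; the `prior.count` and `dispersion` values are simply the function's documented defaults written out explicitly, not the values estimated in the original analysis. The result should therefore not be expected to reproduce the 4.718215 in the table, which depends on the real library sizes and estimated dispersion.

```r
library(edgeR)

# Counts for the gene in the question (one row, four samples)
counts <- matrix(c(8, 7, 7, 4), nrow = 1,
                 dimnames = list("A1CF", c("MG1", "MG2", "MN1", "MN2")))

# Assumed library sizes taken from the question, not the real ones
lib_size <- rep(450000, ncol(counts))

# Average log2-CPM across all samples, with a prior count added and a
# negative binomial dispersion (here the defaults, stated explicitly)
ave_logcpm <- aveLogCPM(counts, lib.size = lib_size,
                        prior.count = 2, dispersion = 0.05)

print(ave_logcpm)
```

Note that the value is a single average over all libraries for the gene, not a ratio between the two groups.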

