I am very new to bioinformatics, R, and edgeR. I used the following code:
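library(edgeR)
# Sample information table passed to readDGE()
targets <- read.delim("cell_line_M.txt", stringsAsFactors = FALSE)
d <- readDGE(targets)
colnames(d) <- c("MG1", "MG2", "MN1", "MN2")
# Common dispersion first, then tagwise (per-gene) dispersions
d <- estimateCommonDisp(d, verbose = TRUE)
d <- estimateTagwiseDisp(d, trend = "none")
# Exact test of MG against MN (the first group in pair is the baseline)
et <- exactTest(d, pair = c("MN", "MG"))
print(et)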
This is the first line of the output I obtained:
logFC logCPM PValue
A1CF 0.20103589 4.718215 0.8603790511
My question is how logCPM is calculated. The corresponding line of the count file is:
gene MG1 MG2 MN1 MN2
A1CF 8 7 7 4
Assuming the library size of every sample is 450000, the CPM values according to the formula
CPM = count * 1e6 / (library size of that sample)
would be
gene MG1 MG2 MN1 MN2
A1CF 17.7 15.5 15.5 8.8
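The arithmetic above can be reproduced in a few lines of base R. This is only a sketch: the library size of 450000 per sample is the assumption stated above rather than the real totals, and the variable names are made up for illustration.

```r
# Counts for A1CF in the four samples (taken from the count table above)
counts <- c(MG1 = 8, MG2 = 7, MN1 = 7, MN2 = 4)

# Assumed library sizes: 450000 for every sample (an assumption, not measured totals)
lib_size <- c(MG1 = 450000, MG2 = 450000, MN1 = 450000, MN2 = 450000)

# CPM = count * 1e6 / library size of the same sample
cpm_manual <- counts * 1e6 / lib_size
print(round(cpm_manual, 1))
```

Rounded to one decimal this gives 17.8, 15.6, 15.6 and 8.9; the values in the table above are the same numbers truncated rather than rounded.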
I expected logCPM to be calculated in the same way as logFC, following the strategy suggested here, like this:
geo_mean_CPM = sqrt(17.7 * 15.5) / sqrt(15.5 * 8.8) --> logCPM = log2(geo_mean_CPM)
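To make the formula concrete, here is the same calculation in base R, using the CPM values from the table above. A ratio of two group means compares MG against MN, so it behaves like a fold change; a plain log2 of the overall mean CPM is included only for comparison. Neither line is claimed to be what edgeR does internally. As far as I understand, edgeR has its own `cpm()` and `aveLogCPM()` functions, which add a prior count and use the real library sizes, so their results would differ from this sketch. The helper `geo_mean` and the variable names are hypothetical.

```r
# CPM values from the table above (assumed library size 450000 per sample)
cpm_mg <- c(MG1 = 17.7, MG2 = 15.5)
cpm_mn <- c(MN1 = 15.5, MN2 = 8.8)

# Geometric mean of a vector of positive numbers (illustrative helper)
geo_mean <- function(x) exp(mean(log(x)))

# Proposed formula: ratio of the two group geometric means, then log2.
# This compares MG against MN, so it acts like a log fold change.
ratio_log2 <- log2(geo_mean(cpm_mg) / geo_mean(cpm_mn))

# For comparison only: log2 of the mean CPM over all four samples,
# i.e. an overall abundance rather than a difference between groups.
overall_log2 <- log2(mean(c(cpm_mg, cpm_mn)))

print(c(ratio_log2 = ratio_log2, overall_log2 = overall_log2))
```

With these inputs the two values come out at about 0.50 and 3.85, and neither equals the reported logCPM of 4.718.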
Is this strategy correct, or is logCPM calculated in a completely different way?
Linking to duplicate post: Calculation of logFC in edgeR