how to calculate the logFC value
yueli7 ▴ 10
@yueli7-8401
China

Hello,

I used edgeR.

After calcNormFactors I obtained the norm.factors, and then tried to calculate the logFC for the first gene (ENSG00000124208) from the raw counts and the norm.factors, in two ways:

1. (478/1.0296636+619/1.0372521+628/1.0362662+744/1.0378383)/4=596

(483/0.9537095+716/0.9525624+240/0.9583181)/3=503

log2(503/596)=-0.244754

2. (478*1.0296636+619*1.0372521+628*1.0362662+744*1.0378383)/4=639

(483*0.9537095+716*0.9525624+240*0.9583181)/3=458

log2(458/639)=-0.480468

Neither result equals -0.4315571, the logFC that exactTest reports for this gene.

Is the calculation of the logFC more complicated than this?

Is it possible to calculate it myself?

> y
An object of class "DGEList"
$counts
                Con1 Con2 Con3 Con4 DHT1 DHT2 DHT3
ENSG00000124208  478  619  628  744  483  716  240
ENSG00000182463   27   20   27   26   48   55   24
ENSG00000124201  180  218  293  275  373  301   88
ENSG00000124207   76   80   85   97   80   81   37
ENSG00000125835  132  200  200  228  280  204   52
16489 more rows ...

$samples
       group lib.size norm.factors
Con1 Control   976847    1.0296636
Con2 Control  1154746    1.0372521
Con3 Control  1439393    1.0362662
Con4 Control  1482652    1.0378383
DHT1     DHT  1820628    0.9537095
DHT2     DHT  1831553    0.9525624
DHT3     DHT   680798    0.9583181

> y <- estimateCommonDisp(y, verbose=TRUE)
Disp = 0.02002 , BCV = 0.1415
> y <- estimateTagwiseDisp(y);
> plotBCV(y);
> et <- exactTest(y);

> a <- et$table
                     logFC   logCPM      PValue
ENSG00000124208 -0.4315571 8.725537 0.002741666
ENSG00000182463  0.6909960 4.691815 0.007143860
ENSG00000124201 -0.0540919 7.511315 0.667158340
ENSG00000124207 -0.4226992 5.900093 0.026677865
ENSG00000125835 -0.2470948 7.098438 0.176266270
ENSG00000125834  0.3098867 5.803637 0.152476022

Aaron Lun ★ 26k
@alun
The city by the bay

You should be using the quasi-likelihood framework (estimateDisp, glmQLFit and glmQLFTest), which offers a number of advantages over the classic and LRT methods. But long story short, yes, the calculation of the log-fold change is more complicated than taking group-wise averages and comparing them.

Check out ?predFC for a brief summary and the associated reference for precise mathematical details. The calculation in exactTest is slightly different due to the differences between the classic and GLM-based methods, but then again, you should be using the GLM-based methods anyway.
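One ingredient missing from the hand calculation in the question is the library size: edgeR does not apply norm.factors to the counts directly, but uses them to scale each lib.size into an effective library size. The base R sketch below redoes the group-average calculation for ENSG00000124208 on that basis, using only the numbers printed in the question. It is a naive illustration, not the calculation exactTest performs, and the variable names are my own.

```r
# Counts for ENSG00000124208 and sample information copied from the question.
counts <- c(Con1 = 478, Con2 = 619, Con3 = 628, Con4 = 744,
            DHT1 = 483, DHT2 = 716, DHT3 = 240)
lib.size <- c(976847, 1154746, 1439393, 1482652, 1820628, 1831553, 680798)
norm.factors <- c(1.0296636, 1.0372521, 1.0362662, 1.0378383,
                  0.9537095, 0.9525624, 0.9583181)
group <- factor(rep(c("Control", "DHT"), times = c(4, 3)))

# Effective library size: raw library size scaled by the normalization factor.
eff.lib.size <- lib.size * norm.factors

# Counts per million relative to the effective library size.
cpm <- counts / eff.lib.size * 1e6

# Naive log2 fold change: ratio of group-wise mean CPMs (DHT over Control).
mean.cpm <- tapply(cpm, group, mean)
naive.logFC <- as.numeric(log2(mean.cpm["DHT"] / mean.cpm["Control"]))
print(naive.logFC)
```

Once library size is accounted for, this naive value comes out at roughly -0.43, much closer to the reported -0.4315571 than either attempt in the question. The remaining gap reflects the model-based estimation described in this thread.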

@gordon-smyth
WEHI, Melbourne, Australia

edgeR computes logFC values using negative binomial generalized linear models. There are some differences between the classic (exactTest) and GLM (glmFit) pipelines, but they all use generalized linear models.

You could theoretically reproduce edgeR's calculation if you are familiar with generalized linear models, but the computation is indeed much more complicated (and much better) than what you have done so far.
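To give a feel for what a negative binomial GLM fit looks like for a single gene, here is a minimal sketch using glm() with the negative.binomial() family from the MASS package, not edgeR's own code. It makes several assumptions: the dispersion is fixed at the common value 0.02002 printed by estimateCommonDisp (the question's pipeline goes on to use tagwise dispersions), the log effective library size enters as an offset, and none of edgeR's further refinements are applied. It should therefore approximate, not reproduce, the value in et$table.

```r
library(MASS)  # ships with R; provides the negative.binomial() family

# Counts for ENSG00000124208 and sample information copied from the question.
counts <- c(478, 619, 628, 744, 483, 716, 240)
lib.size <- c(976847, 1154746, 1439393, 1482652, 1820628, 1831553, 680798)
norm.factors <- c(1.0296636, 1.0372521, 1.0362662, 1.0378383,
                  0.9537095, 0.9525624, 0.9583181)
group <- factor(rep(c("Control", "DHT"), times = c(4, 3)))
eff.lib.size <- lib.size * norm.factors

# Assumption: fix the dispersion at the common value reported in the question.
# MASS parameterizes the variance as mu + mu^2 / theta, so theta = 1 / dispersion.
dispersion <- 0.02002

# Log-linear NB model; the offset accounts for sequencing depth per sample.
fit <- glm(counts ~ group + offset(log(eff.lib.size)),
           family = negative.binomial(theta = 1 / dispersion))

# The group coefficient is on the natural-log scale; convert to log2.
glm.logFC <- as.numeric(coef(fit)["groupDHT"]) / log(2)
print(glm.logFC)
```

The offset term is the key design choice here: it lets the model compare expression rates per unit of effective library size instead of comparing raw counts.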