Question

Limma outputs different results for same sample?

0

Entering edit mode

snijesh ▴ 20

@snijesh-20358

Last seen 16 days ago

India

In the limma user manual, page 41-43 it provides two ways of calculation of DEGs

Approach1

#Differentially expressed genes can be found by
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef="Grp1vsGrp2", adjust="BH")

Approach2

#Differentially expressed genes can be found by
> fit <- lmFit(eset, design)
> cont.matrix <- makeContrasts(Grp1vsGrp2=Grp1 - Grp2, levels=design)
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
> topTable(fit2, adjust="BH")

But I get different output for these methods:

Approcah1: output

ID     logFC   AveExpr         t      P.Value    adj.P.Val          B
1   GE_BrightCorner 14.034775 14.142391 48.329231 1.062840e-06 5.771813e-06  5.9423764
2        DarkCorner  3.526284  3.681083 11.510530 3.202467e-04 4.538190e-04  1.0973086
3        DarkCorner  4.438930  4.005311 24.177036 1.695687e-05 4.254662e-05  4.0948249
4      A_23_P117082 12.930345 12.460447 67.487497 2.788662e-07 4.420224e-06  6.4346613
5     A_33_P3246448  4.475369  4.365248 24.873738 1.514011e-05 3.896822e-05  4.1925378
6     A_33_P3318220  4.395836  4.075980 22.628496 2.207786e-05 5.232768e-05  3.8608390
7     A_33_P3236322  4.103927  4.176582 23.902379 1.774740e-05 4.407702e-05  4.0550613

Approach 2: output

           ID       logFC   AveExpr           t     P.Value adj.P.Val         B
 1   GE_BrightCorner -0.21523164 14.142391 -0.52407752 0.627849111 0.8757925 -6.501968
 2        DarkCorner -0.30959707  3.681083 -0.71459481 0.514243534 0.8245442 -6.370012
 3        DarkCorner  0.86723802  4.005311  3.34001263 0.028713468 0.3393645 -3.441232
 4      A_23_P117082  0.93979506 12.460447  3.46841753 0.025505731 0.3271261 -3.309127
 5     A_33_P3246448  0.22024196  4.365248  0.86556019 0.435430560 0.7847096 -6.243565
 6     A_33_P3318220  0.63971250  4.075980  2.32854199 0.080197184 0.4662019 -4.575855

Expect AveExpr all other parameters are different

sessionInfo()

R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] limma_3.42.2

loaded via a namespace (and not attached):
[1] compiler_3.6.2 tools_3.6.2    tinytex_0.27   xfun_0.19     
>

topTable MicroarrayData ebayes limma Microarray • 1.2k views

ADD COMMENT • link 3.2 years ago snijesh ▴ 20

0

Entering edit mode

Cross-posted: https://www.biostars.org/p/490523/

ADD REPLY • link 3.2 years ago Kevin Blighe ★ 3.9k

1

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 3 hours ago

WEHI, Melbourne, Australia

The limma User's Guide shows how to get identical results using appropriate contrasts from two different design matrices. You seem to have instead applied the different contrasts to the same design matrix. As James suggests, your Approach 1 output is obviously wrong.

BTW the code you show couldn't run as given. There must be hidden code not shown.

ADD COMMENT • link 3.2 years ago Gordon Smyth 50k

0

Entering edit mode

Ok got it

ADD REPLY • link 3.2 years ago snijesh ▴ 20

score 5 · Accepted Answer · 2021-02-12

The critical thing missing here is your design matrix. But looking at your results for the first comparison, it appears that you have a cell means model (that you set up using model.matrix(~0 + <something>)), in which case your 'Grp1vsGrp2' coefficient isn't a comparison, but instead is the mean expression for one of your groups (e.g., you need a contrasts matrix as well).

Another reason to suspect that these are just the mean intensity values instead of actual logFC values is that the top 'gene' there is a BrightCorner spot, which is a positive control spot that Agilent puts at the corner of the arrays so the scanner can be aligned. The scanner generates a 16-bit TIFF file, so the maximum possible value for a spot is 2^16 (or 16 after taking logs). With saturation it usually ends up being somewhere around 14, which is about what I would expect for the average of a BrightCorner positive control.

So, two things. First, you should remove all the probes that are controls of any type, as you don't want controls in your topTable results, and you are increasing your multiplicity burden unnecessarily. And second, if you fit a cell means model (which I tend to prefer, all things equal), you always have to make explicit contrasts because all you are doing at first is estimating the mean expression for each group, which is uninteresting on its own.