Discrepancy in the output of decideTests (total not equal to sum of up/down genes)
2
0
Entering edit mode
@ahmedasadik-11805
Last seen 5.9 years ago
Germany, DKFZ

Dear All,

I am having some strange experience with the function decideTests. When I compare between the total number of genes between my three groups and the the sum of the corresponding up and down genes the results are different.

The sum is either larger than or smaller than the total.

Am I doing something wrong, or is there an explanation for this discrepancy?

Below is code and output for two different comparisons using the same fit (two channel array with common reference) and examples highlighted in yellow:

d.lm <- modelMatrix(targets, ref = "reference")
fit.94.g <- lmFit(data94.g, design = d.lm, weights = arrayWeights(data94.g, design = d.lm))
fit.94.g_e <- eBayes(fit.94.g)

data94_results <- decideTests(fit.94.g_e, method="separate", adjust.method="fdr", p.value=0.05)

> vennCounts(data94_results)
  MES RTK1 RTK2 Counts
1   0    0    0   4107
2   0    0    1   2626
3   0    1    0    158
4   0    1    1    176
5   1    0    0    506
6   1    0    1   4520
7   1    1    0     79
8   1    1    1   8180
attr(,"class")
[1] "VennCounts"

> vennCounts(data94_results, "up")
  MES RTK1 RTK2 Counts
1   0    0    0  11428
2   0    0    1   1585
3   0    1    0     48
4   0    1    1     88
5   1    0    0    334
6   1    0    1   2871
7   1    1    0     23
8   1    1    1   3975
attr(,"class")
[1] "VennCounts"

> vennCounts(data94_results, "down")
  MES RTK1 RTK2 Counts
1   0    0    0  13024
2   0    0    1   1044
3   0    1    0    117
4   0    1    1     85
5   1    0    0    176
6   1    0    1   1649
7   1    1    0     52
8   1    1    1   4205
attr(,"class")
[1] "VennCounts"


> data94_results1 <- decideTests(fit.94.g_e, method="global", adjust.method="fdr", p.value=0.05)

> vennCounts(data94_results1)
  MES RTK1 RTK2 Counts
1   0    0    0   4247
2   0    0    1   2504
3   0    1    0    191
4   0    1    1    209
5   1    0    0    510
6   1    0    1   3975
7   1    1    0     92
8   1    1    1   8624
attr(,"class")
[1] "VennCounts"

> vennCounts(data94_results1, "up")
  MES RTK1 RTK2 Counts
1   0    0    0  11529
2   0    0    1   1514
3   0    1    0     58
4   0    1    1    108
5   1    0    0    347
6   1    0    1   2564
7   1    1    0     28
8   1    1    1   4204
attr(,"class")
[1] "VennCounts"

> vennCounts(data94_results1, "down")
  MES RTK1 RTK2 Counts
1   0    0    0  13062
2   0    0    1    993
3   0    1    0    141
4   0    1    1     98
5   1    0    0    168
6   1    0    1   1411
7   1    1    0     59
8   1    1    1   4420
attr(,"class")
[1] "VennCounts"

I would appreciate your help on this, because I am stuck and have been scratching my head for hours now.

Thanks,

Ahmed

limma decidetests • 1.1k views
ADD COMMENT
3
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 6 hours ago
The city by the bay

With include="up", downregulated genes are ignored, i.e., they might as well be non-DE genes. So, if you have a gene that is upregulated in one contrast but downregulated in the two other contrasts, it would be:

> vennCounts(cbind(1,-1,-1), include="both")
  Group 1 Group 2 Group 3 Counts
1       0       0       0      0
2       0       0       1      0
3       0       1       0      0
4       0       1       1      0
5       1       0       0      0
6       1       0       1      0
7       1       1       0      0
8       1       1       1      1
attr(,"class")
[1] "VennCounts"

... without considering the direction of the change, but this would switch to:

> vennCounts(cbind(1,-1,-1), include="up")
  Group 1 Group 2 Group 3 Counts
1       0       0       0      0
2       0       0       1      0
3       0       1       0      0
4       0       1       1      0
5       1       0       0      1
6       1       0       1      0
7       1       1       0      0
8       1       1       1      0
attr(,"class")
[1] "VennCounts"

... when you only treat upregulated genes as being of interest. In short, the genes will move around the different categories depending on whether you consider genes changing in one direction or the other, or both. Thus, the numbers needn't add up per category.

ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

If you want to examine the number of DE genes from decideTests(), then use

summary(data94_results)
summary(data94_results1)
ADD COMMENT
0
Entering edit mode

Dear Gordon and Aaron,

Thank you very much for your replies. This seems to solve the issue, but I have another question. Which genes should I use to run GSEA. Should I use the genes after the option include="both", or include=c("up", "down")? because the total number of genes from both options are slightly different as you have explained.

Best,

Ahmed

ADD REPLY
0
Entering edit mode

If you have a new question, then please post it as a new question, rather than adding a comment to your previous question. It would also be good to explain more about what you are trying to do, because it isn't obvious to me how making a DE list is connected to running GSEA.

ADD REPLY

Login before adding your answer.

Traffic: 520 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6