limma voom: summary(decideTests(fit))
2
0
Entering edit mode
scheran02 • 0
@scheran02-20443
Last seen 3.6 years ago

Hi, I am running an RNAseq analysis with limma and voom. I am puzzled by the summary(decideTests(fit)) function. I am not sure what exactly it shows. According to my (limited) understanding, running my analysis should give 153 (104+49) genes with adjusted p-value<0.05 when looking at the summary(results(fit)) for coefficient 2 (GroupNonprogr):

        (Intercept) GroupNonprogr pairsB
-1           2           104                44    
0         9569         18789         18751  
1         9371           49              147

However, none of the genes meet this cutoff, when I extract coef 2. It is not an error, but I have problem understanding what the summary function actually shows, and how I can extract those values in an xls or txt file. So far I use:

tab <- topTable(fit, coef=2, n=Inf, sort.by='none', genelist=fData.short)
ix <- sort(tab$logFC, index.return=TRUE, decreasing=TRUE)$ix
tab.sorted <- tab[ix,]
write.table(tab.sorted, file="filename.txt", row.names=TRUE, quote=FALSE, sep="\t")

I would be very grateful if someone spent a few minutes to help me out here.

Andreas

limma • 1.4k views
ADD COMMENT
0
Entering edit mode

Hi Andreas,

Have you figured it out? I have the same problem.

I run

DEresults = decideTests(data.fit.eb,method='global',adjust.method="BH",p.value=0.05,lfc=1)
summary(DEresults)

for one of the coefficients i got:

  -1     0     1 
   15 54649    11 

Then I concluded that 26 genes are differentially expressed. But looking in the DEresults table, none of the genes meet that criteria. None of the genes have a p.adj <0.05 I have 127 genes with absolute logFC > 1 Then 70 genes with absolute logFC > 1 and pval <0.05 but 0 genes with absolute logFC > 1 and adj pval <0.05

How does summary gets to 26 genes?

Hope you can help me. Suz

ADD REPLY
0
Entering edit mode

Dear Suz,

If you want to ask a question, please post of question of your own rather than commenting on an unrelated question from a year ago.

I am guessing that you might be confused by "global" FDR adjustment. The limma documentation explains the difference between "global" and "separate" FDR adjustment.

ADD REPLY
3
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States

I think there must be an error in your code somewhere. But it's hard to say, since you just give us some snippets, and certainly are not giving us a reproducible, self contained example. So I'll give you one.

> set.seed(0xabeef)
> mat <- matrix(rnorm(10000), 1000)
> mat[1:20,1:5] <- mat[1:20,1:5] + 2
> design <- model.matrix(~gl(2,5))
> fit <- lmFit(mat, design)
> fit2 <- eBayes(fit)
> topTable(fit2,2,Inf,p.value = 0.05)
   logFC AveExpr     t  P.Value adj.P.Val     B
10 -2.91   0.853 -4.61 4.67e-06   0.00467 3.842
1  -2.75   1.176 -4.37 1.44e-05   0.00721 2.838
19 -2.63   1.051 -4.17 3.40e-05   0.01135 2.075
4  -2.57   0.974 -4.09 4.86e-05   0.01216 1.759
6  -2.53   1.188 -4.01 6.66e-05   0.01332 1.482
9  -2.39   1.518 -3.80 1.57e-04   0.02622 0.725
> rslt <- decideTests(fit2)

> head(rslt)
     (Intercept) gl(2, 5)2
[1,]           1        -1
[2,]           1         0
[3,]           1         0
[4,]           1        -1
[5,]           1         0
[6,]           1        -1
> which(rslt[,2] != 0L)
[1]  1  4  6  9 10 19

Which you can see are the same rows as from topTable. And anyway, there's this from ?decideTests

     The setting 'method="separate"' is equivalent to using 'topTable'
     separately for each coefficient in the linear model fit, and will
     identify the same probes as significantly differentially expressed
     if 'adjust.method' is the same. 'method="global"' will treat the
     entire matrix of t-statistics as a single vector of unrelated
     tests. 'method="hierarchical"' adjusts down genes and then across
     contrasts. 'method="nestedF"' adjusts down genes and then uses
     'classifyTestsF' to classify contrasts as significant or not for
     the selected genes. Please see the limma User's Guide for a
     discussion of the statistical properties of these methods.

The limma package has been around since the early 2000's, and if decideTests and topTable didn't give the same results, someone would have pointed that out long ago. So like I said, there's probably an error in your code somewhere (as an example, you say I am puzzled by the summary(decideTests(fit)), but then in the same paragraph you have this: when looking at the summary(results(fit)), which is different code).

ADD COMMENT
0
Entering edit mode

Thank you for all this, James. And of course I did not mean to say that there is an error in limma, which is why I wrote "It is not an error, but I have problem understanding what the summary function actually shows". My own problem.

ADD REPLY
0
Entering edit mode

'method="hierarchical"' adjusts down genes and then across contrasts how do I extract specific contrasts? I have in my design multiple genetics as well as cytogenetics data such as NPM1,DNMT3A mutation etc etc as well translocation inversion etc.

ADD REPLY
1
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

Andreas, I find it hard to figure out what you are trying to do or what help you are seeking.

You say that you don't know what summary(decideTests(fit)) does, but you seem to understand well enough. The output tells you there are 153 DE genes (49 up and 104 down) in the Nonprogr group.

You say that you want to write results to a file, but you've obviously already written a tab-delimited text file. Why is that not what you want? What is it that you do want?

You clam that actually you have no DE genes for coef 2, but that just isn't true. How have you come that conclusion anyway? The code you show doesn't even look at DE genes.

You say you want to "extract those values", but what values exactly are you refering to? Do you want to see the DE genes? Why not just use

topTable(fit, coef=2, n=153)

Why are you sorting genes by fold change if you actually want genes with small p-values? If you did want to sort genes by fold-change, why didn't you use sort.by="logFC"? It is all quite puzzling.

You've framed your question about summary(decideTests()) but I am guessing the issue is more about how to extract DE genes.

Finally, please consider upgrading to the current version of Bioconductor and limma. The version you are using is from 2017 or earlier. If you used a recent version of limma, then the row names from summary(decideTests()) would be more informative.

ADD COMMENT

Login before adding your answer.

Traffic: 381 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6