Search
Question: Limma voom or trend?
0
gravatar for b.nota
20 months ago by
b.nota300
Netherlands
b.nota300 wrote:

Hello,

This might have been asked before, but I couldn't find the answer. So I try to ask here what the difference between voom and trend is for RNAseq analysis.

I have a study with four groups, and a batch effect. The lib sizes are between 8M to 11M reads. In the user's guide it says that when the lib sizes are not more than 3 fold different, trend is the method to choose. So when I do this I get significant genes. However, when I do the analysis with voom, I get many more significantly expressed genes. Why is that, and how reliable are these extra genes? Does this mean that the extra genes from limma voom are false positive?

trend

> DGE1 <- DGEList(counts, group=Group)
> keep <- rowSums(cpm(DGE1)>1) >= 3
> DGE2 <- DGE1[keep,]
> y1 <- calcNormFactors(DGE2)
> logCPM <- cpm(y1, log=TRUE, prior.count=0.5)
> design <- model.matrix(~0 + targets$Group + targets$Donor)
> fit <- lmFit(logCPM, design)
> fit2 <- eBayes(contrasts.fit(fit, contr))
> summary(decideTests(fit2, method="separate"))
   PPvsPN PPvsNN PPvsNP PNvsNN PNvsNP NPvsNN
-1     54    140     86     27    433    137
0   15897  15654  15796  16039  15095  15803
1     154    311    223     39    577    165

voom

> DGE1 <- DGEList(counts_minus, group=Group)
> keep <- rowSums(cpm(DGE1)>1) >= 3
> DGE2 <- DGE1[keep,]
> y1 <- calcNormFactors(DGE2)
> design <- model.matrix(~0 + targets$Group + targets$Donor)
> v <- voom(y1, design, plot=TRUE)
> fit <- lmFit(v, design)
> fit2 <- eBayes(contrasts.fit(fit, contr))
> summary(decideTests(fit2, method="separate"))
   PPvsPN PPvsNN PPvsNP PNvsNN PNvsNP NPvsNN
-1    363    579    428    277    873    465
0   15166  14647  15023  15469  14252  15087
1     576    879    654    359    980    553


 

 

 

 

 

ADD COMMENTlink modified 7 months ago by max.moldovan0 • written 20 months ago by b.nota300
2
gravatar for James W. MacDonald
20 months ago by
United States
James W. MacDonald47k wrote:

The code for limma trend isn't doing limma trend. You have to include trend = TRUE in the call to eBayes, or else you are doing 'regular' limma. This probably has a lot to do with the difference in numbers of significant genes.

In addition, if I do a google search like 'limma trend vs voom site:support.bioconductor.org', the first Limma-voom vs limma-trend seems relevant.

ADD COMMENTlink written 20 months ago by James W. MacDonald47k

Well, that's embarrassing... Forgetting the "trend=T" part in the limma trend analysis! Must have something to do with Mondays...

Thanks for your quick help, the figures are more alike now with the real trend included:

> fit2 <- eBayes(contrasts.fit(fit, contr),trend=TRUE)

> summary(decideTests(fit2, method="separate"))
   PPvsPN PPvsNN PPvsNP PNvsNN PNvsNP NPvsNN
-1    257    428    303    193    796    372
0   15339  14793  15212  15544  14403  15250
1     509    884    590    368    906    483

 

ADD REPLYlink written 20 months ago by b.nota300

As you've discovered, voom and limma-trend tend to give about the same number of DE genes, although it will vary from one dataset to another.

ADD REPLYlink written 20 months ago by Gordon Smyth35k
0
gravatar for max.moldovan
7 months ago by
max.moldovan0 wrote:

I was googling for this exact question and this what here is an excellent numerical illustration. What is the difference methodologically? It is not clear from the manual etc. Isn't it that by not applying voom we don't account for the common mean-variance dependance issue?

ADD COMMENTlink written 7 months ago by max.moldovan0
1

First, unless you are answering a question, don't use the 'Add your answer' box. Because you aren't answering a question. Use the ADD REPLY link to add replies or additional questions.

I would disagree with your statement regarding clarity of the 'manual', although I am not sure what you mean by that. There are help pages, and they seem abundantly clear to me. If you read ?voom, you will see this:

    'voom' is an acronym for mean-variance modelling at the
     observational level. The key concern is to estimate the
     mean-variance relationship in the data, then use this to compute
     appropriate weights for each observation. Count data almost show
     non-trivial mean-variance relationships. Raw counts show
     increasing variance with increasing count size, while log-counts
     typically show a decreasing mean-variance trend. This function
     estimates the mean-variance trend for log-counts, then assigns a
     weight to each observation based on its predicted variance. The
     weights are then used in the linear modelling process to adjust
     for heteroscedasticity.

     'voom' performs the following specific calculations. First, the
     counts are converted to logCPM values, adding 0.5 to all the
     counts to avoid taking the logarithm of zero. The matrix of logCPM
     values is then optionally normalized. The 'lmFit' function is used
     to fit row-wise linear models. The 'lowess' function is then used
     to fit a trend to the square-root-standard-deviations as a
     function of average logCPM. The trend line is then used to predict
     the variance of each logCPM value as a function of its fitted
     value, and the inverse variances become the estimated precision
     weights.

And if you look at ?eBayes, you will see

     If 'trend=TRUE' then an intensity-dependent trend is fitted to the
     prior variances 's2.prior'. Specifically, 'squeezeVar' is called
     with the 'covariate' equal to 'Amean', the average log2-intensity
     for each gene. See 'squeezeVar' for more details.

There are also multiple references in each man page that you can reference if you really want to get into the weeds.

ADD REPLYlink written 7 months ago by James W. MacDonald47k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 181 users visited in the last hour