Question: Limma example where one comparison gives different results depending on the matrix formulation
gravatar for alexgraehl
5 weeks ago by
United States
alexgraehl20 wrote:

Hi fellow Limma fans,

 I can't imagine that this is actually a bug, but is is a REALLY odd situation in which one contrast give me different results depending on how it is formulated.

 If you (like me) have been assuming that it doesn't matter if you make your limma design matrix with an explicit contrast or an implicit one, well, check again—you may get different results!

 (Note that these are NOT due to numeric under/overflow.)

 See if you can solve the mystery... I couldn't!

Below is a minimal R test case that relies on no external data and generates a scatterplot.



require(limma); require(edgeR) # edgeR is for DGEList

counts <- data.frame(matrix(as.integer(10*rexp(9*50)), ncol=9))
colnames(counts) <- c("CTRL1", "CTRL2", "CTRL3", "DrugX_1", "DrugX_2", "DrugX_3", "DrugY_1", "DrugY_2", "DrugY_3")
rownames(counts) <- paste0("Gene_", 1:nrow(counts))

counts[5:15 , grepl("DrugX", colnames(counts))] = 4 + counts[5:15 , grepl("DrugX", colnames(counts))] * 5 # fake diff expressed genes
counts[30:40 , grepl("DrugY", colnames(counts))] = 2 + counts[30:40 , grepl("DrugX", colnames(counts))] * 12 # fake diff expressed genes
counts[13:25 , grepl("Drug.*", colnames(counts))] = 3 + counts[13:25 , grepl("Drug.*", colnames(counts))] * 8 # some more fake stuff

heatmap(as.matrix(log2(1+counts)), scale="none") # show the data

metadata=data.frame("SAMPLE_ID"=colnames(counts), "GROUP"=c(rep("CTRL",3), rep("DrugX",3), rep("DrugY",3)))

pvals = list(); logFC = list()

    design.mat = model.matrix(~ GROUP, metadata)
    colnames(design.mat) <- c("INTERCEPT","DrugX", "DrugY")
    contrast.mat         <- limma::makeContrasts(DrugX  , DrugY , DrugY-DrugX, levels=design.mat) # don't subtract the intercept!
    comparisons = c("DrugX", "DrugY", "DrugY - DrugX")
  } else if (STYLE=="NO_INTERCEPT") {
    design.mat = model.matrix(~ 0 + GROUP, metadata)
    colnames(design.mat) <- gsub("^GROUP", "", colnames(design.mat))
    contrast.mat         <- limma::makeContrasts(DrugX-CTRL, DrugY-CTRL, DrugY-DrugX, levels=design.mat)
    comparisons = c("DrugX - CTRL", "DrugY - CTRL", "DrugY - DrugX")
  } else {
    stop("unrecognized 'STYLE'")
  dge = DGEList(counts=counts, group=as.factor(metadata$GROUP))
  dubious_comparison = comparisons[3] # the drugY - drugX one   
  print(paste0("Running the following comparison: ", dubious_comparison))
  voomObj = voom(dge, design=design.mat, lib.size=dge$samples$lib.size, plot=TRUE)
  fit     = limma::lmFit(object=voomObj, design=voomObj$design)
  fit     =, contrast.mat) # or remove this line to go back to the previous method
  fit     = limma::eBayes(fit)
  toptab  = limma::topTable(fit=fit, coef=dubious_comparison, number=Inf,'p', confint=TRUE)
  pvals[[STYLE]] = toptab$P.Value[order(rownames(toptab))]
  logFC[[STYLE]] = toptab$logFC[order(rownames(toptab))]
  print(paste0("If we use the style where our design matrix is set up with: ", STYLE, ", then our results are (note the P.Value):"))

par(mfrow=c(1,2)) # 2 plots
par(pty='s') # square
plot(logFC[["NO_INTERCEPT"]], logFC[["WITH_INTERCEPT"]], col="blue", pch=22, main="logFC comparison (always perfectly on the line)")
abline(a=0, b=1, col="#0000FF55")

par(pty='s') # square
plot(pvals[["NO_INTERCEPT"]], pvals[["WITH_INTERCEPT"]], col="red", pch=19, main="P-value comparison (NOT on the line!)")
abline(a=0, b=1, col="#0000FF55")


Below: the image compares the two methods. Fold change is the same, but (for some reason) the P-values are different, so I assume that the computed variance being different is the culprit.

P-values on the right are expected to be on the line, but they are not!



ADD COMMENTlink modified 5 weeks ago by Gordon Smyth35k • written 5 weeks ago by alexgraehl20
gravatar for Aaron Lun
5 weeks ago by
Aaron Lun21k
Cambridge, United Kingdom
Aaron Lun21k wrote:

This is probably due to the (known) behaviour described in ?

     Warning. For efficiency reasons, this function does not
     re-factorize the design matrix for each probe. A consequence is
     that, if the design matrix is non-orthogonal and the original fit
     included quality weights or missing values, then the unscaled
     standard deviations produced by this function are approximate
     rather than exact. The approximation is usually acceptable. The
     results are always exact if the original fit was a oneway model.

... which pretty much sums it all up. The first design matrix (with an intercept) does not have orthogonal column vectors, and your analysis involves observation weights; so yes, you end up using the approximation.

ADD COMMENTlink written 5 weeks ago by Aaron Lun21k
gravatar for Gordon Smyth
5 weeks ago by
Gordon Smyth35k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth35k wrote:

Aaron has pointed you to do the reason and the note in the documentation. This question has actually been asked a number of times over the years. Completely removing this issue from limma would require a substantial rewrite to the code and some changes to the user-interface, as well as slower performance, and I haven't managed to find a satisfactory way to do it.

In your example, the standard errors and p-values or the DrugY-DrugX contrast with intercept are approximate. All the other contrasts are exact. The contrasts without intercept are all exact, as are the DrugX and DrugY contrasts with the intercept.

ADD COMMENTlink modified 5 weeks ago • written 5 weeks ago by Gordon Smyth35k

Thanks a lot for the explanation! Very good to know. I guess in a perfect world, this would be in the limma documentation (maybe it already is, actually), but thanks again for explaining it.

ADD REPLYlink written 5 weeks ago by alexgraehl20

(I guess that's what the ? explanation is, I just didn't understand that it would have appreciable effects on a small dataset.)

ADD REPLYlink written 5 weeks ago by alexgraehl20



Hi, Gordon 

I just wanted to clarify a bit the technical side of the issue. I apologize if this has also been answered before, and please feel free to point me to the existing answer, if one exists. If I understand the code correctly, the problem arises in in the following few lines:

if (orthog) 
  fit$stdev.unscaled <- sqrt(fit$stdev.unscaled^2 %% contrasts^2) 
else { 
  R <- chol(cormatrix) 
  ngenes <- NROW(fit$stdev.unscaled) 
  ncont <- NCOL(contrasts) 
  U <- matrix(1, ngenes, ncont, dimnames = list(rownames(fit$stdev.unscaled), colnames(contrasts))) 
  o <- array(1, c(1, ncoef)) 
  for (i in 1:ngenes) { 
    RUC <- R %% .vecmat(fit$stdev.unscaled[i, ], contrasts) 
    U[i, ] <- sqrt(o %% RUC^2) 
  fit$stdev.unscaled <- U 

Here, matrix R is the Choleski decomposition of correlation version of $(X'X)^{-1}$, where X is the design matrix. I believe that to get the correct values of "stdev.unscaled", matrix R should be created inside of the for (i in 1:ngenes) loop and be the Choleski decomposition of $(X'W_iX)^{-1}$, where i is a gene index and $W_i$ is the gene-specific matrix of observational weights.

It would look something like this (but more efficient and with bells and whistles) inside of the loop:

R <- chol( 
      t(fit$design) %% diag(ORIGINAL_VOOM_OBJECT$weights[i, ]) %*% fit$design 

Is this right? If so, it seems to me that is not able to perform the correct calculation (apart from any computational performance concerns) because it does not have access to the observation weights which are not retained by the output of lmFit().

A solution would be to have lmFit() output keep this information which would break its current structure. Is that what you mean by "substantial rewrite to the code and some changes to the user-interface"? Or is it just the concern about performing this (or similar) operation inside of a loop (for what it's worth, I'm not sure that itself would add much compute time, but I could be wrong).

Apologies for pressing the point, but I just want to get a clear understanding and appreciation of the concerns that prevent you from removing this issue from limma.

Thanks a lot for developing such a great tool and for the ongoing maintenance and support!


ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Oleg Mayba20

That is part of what I mean, but only the tip of the iceberg. An updated limma would need to store genewise QR-decompositions. As well as that, it requires a rewrite of the F-statistic computation, the subsetting of fit objects and other things.

There are lots of special cases. Your outlined code doesn't allow for missing y values, nor for the possibility that a blocking variable and correlation might have been set when lmFit was run, nor for the possibility the linear model was fit by robust regression.

The computation you outline can take a lot of time if the number of genes/predictors/samples is very large. is currently almost instantaneous but the new code will repeat almost all the computation already done by lmFit(). That's the least of my concerns however.

I'm not going to list all the limma functions that have to be changed in this discussion forum. I have plans to migrate limma in this way, but it will take time as limma is a big package with a lot of interlocking parts.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by Gordon Smyth35k

Thanks for the detailed reply! 

Indeed, ensuring that the change propagates properly to all of the functionality supported by limma seems like quite an undertaking. 


ADD REPLYlink written 5 weeks ago by Oleg Mayba20
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 430 users visited in the last hour