Hi fellow Limma fans,
I can't imagine that this is actually a bug, but is is a REALLY odd situation in which one contrast give me different results depending on how it is formulated.
If you (like me) have been assuming that it doesn't matter if you make your limma design matrix with an explicit contrast or an implicit one, well, check again—you may get different results!
(Note that these are NOT due to numeric under/overflow.)
See if you can solve the mystery... I couldn't!
Below is a minimal R test case that relies on no external data and generates a scatterplot.
require(limma); require(edgeR) # edgeR is for DGEList set.seed(1234) counts <- data.frame(matrix(as.integer(10*rexp(9*50)), ncol=9)) colnames(counts) <- c("CTRL1", "CTRL2", "CTRL3", "DrugX_1", "DrugX_2", "DrugX_3", "DrugY_1", "DrugY_2", "DrugY_3") rownames(counts) <- paste0("Gene_", 1:nrow(counts)) counts[5:15 , grepl("DrugX", colnames(counts))] = 4 + counts[5:15 , grepl("DrugX", colnames(counts))] * 5 # fake diff expressed genes counts[30:40 , grepl("DrugY", colnames(counts))] = 2 + counts[30:40 , grepl("DrugX", colnames(counts))] * 12 # fake diff expressed genes counts[13:25 , grepl("Drug.*", colnames(counts))] = 3 + counts[13:25 , grepl("Drug.*", colnames(counts))] * 8 # some more fake stuff heatmap(as.matrix(log2(1+counts)), scale="none") # show the data metadata=data.frame("SAMPLE_ID"=colnames(counts), "GROUP"=c(rep("CTRL",3), rep("DrugX",3), rep("DrugY",3))) pvals = list(); logFC = list() for (STYLE in c("NO_INTERCEPT", "WITH_INTERCEPT")) { if (STYLE == "WITH_INTERCEPT") { design.mat = model.matrix(~ GROUP, metadata) colnames(design.mat) <- c("INTERCEPT","DrugX", "DrugY") contrast.mat <- limma::makeContrasts(DrugX , DrugY , DrugY-DrugX, levels=design.mat) # don't subtract the intercept! comparisons = c("DrugX", "DrugY", "DrugY - DrugX") } else if (STYLE=="NO_INTERCEPT") { design.mat = model.matrix(~ 0 + GROUP, metadata) colnames(design.mat) <- gsub("^GROUP", "", colnames(design.mat)) contrast.mat <- limma::makeContrasts(DrugX-CTRL, DrugY-CTRL, DrugY-DrugX, levels=design.mat) comparisons = c("DrugX - CTRL", "DrugY - CTRL", "DrugY - DrugX") } else { stop("unrecognized 'STYLE'") } #contrast.mat dge = DGEList(counts=counts, group=as.factor(metadata$GROUP)) dubious_comparison = comparisons[3] # the drugY - drugX one print(paste0("Running the following comparison: ", dubious_comparison)) voomObj = voom(dge, design=design.mat, lib.size=dge$samples$lib.size, plot=TRUE) fit = limma::lmFit(object=voomObj, design=voomObj$design) fit = limma::contrasts.fit(fit, contrast.mat) # or remove this line to go back to the previous method fit = limma::eBayes(fit) toptab = limma::topTable(fit=fit, coef=dubious_comparison, number=Inf, sort.by='p', confint=TRUE) pvals[[STYLE]] = toptab$P.Value[order(rownames(toptab))] logFC[[STYLE]] = toptab$logFC[order(rownames(toptab))] print(paste0("If we use the style where our design matrix is set up with: ", STYLE, ", then our results are (note the P.Value):")) print(head(toptab)) } par(mfrow=c(1,2)) # 2 plots par(pty='s') # square plot(logFC[["NO_INTERCEPT"]], logFC[["WITH_INTERCEPT"]], col="blue", pch=22, main="logFC comparison (always perfectly on the line)") abline(a=0, b=1, col="#0000FF55") par(pty='s') # square plot(pvals[["NO_INTERCEPT"]], pvals[["WITH_INTERCEPT"]], col="red", pch=19, main="P-value comparison (NOT on the line!)") abline(a=0, b=1, col="#0000FF55")
Below: the image compares the two methods. Fold change is the same, but (for some reason) the P-values are different, so I assume that the computed variance being different is the culprit.
Thanks a lot for the explanation! Very good to know. I guess in a perfect world, this would be in the limma documentation (maybe it already is, actually), but thanks again for explaining it.
(I guess that's what the ?contrasts.fit explanation is, I just didn't understand that it would have appreciable effects on a small dataset.)
Hi, Gordon
I just wanted to clarify a bit the technical side of the issue. I apologize if this has also been answered before, and please feel free to point me to the existing answer, if one exists. If I understand the code correctly, the problem arises in
limma::contrasts.fit()
in the following few lines:Here, matrix R is the Choleski decomposition of correlation version of $(X'X)^{-1}$, where X is the design matrix. I believe that to get the correct values of "stdev.unscaled", matrix R should be created inside of the
for (i in 1:ngenes)
loop and be the Choleski decomposition of $(X'W_iX)^{-1}$, where i is a gene index and $W_i$ is the gene-specific matrix of observational weights.It would look something like this (but more efficient and with bells and whistles) inside of the loop:
Is this right? If so, it seems to me that limma::contrasts.fit() is not able to perform the correct calculation (apart from any computational performance concerns) because it does not have access to the observation weights which are not retained by the output of lmFit().
A solution would be to have lmFit() output keep this information which would break its current structure. Is that what you mean by "substantial rewrite to the code and some changes to the user-interface"? Or is it just the concern about performing this (or similar) operation inside of a loop (for what it's worth, I'm not sure that itself would add much compute time, but I could be wrong).
Apologies for pressing the point, but I just want to get a clear understanding and appreciation of the concerns that prevent you from removing this issue from limma.
Thanks a lot for developing such a great tool and for the ongoing maintenance and support!
Oleg.
That is part of what I mean, but only the tip of the iceberg. An updated limma would need to store genewise QR-decompositions. As well as that, it requires a rewrite of the F-statistic computation, the subsetting of fit objects and other things.
There are lots of special cases. Your outlined code doesn't allow for missing y values, nor for the possibility that a blocking variable and correlation might have been set when lmFit was run, nor for the possibility the linear model was fit by robust regression.
The computation you outline can take a lot of time if the number of genes/predictors/samples is very large. contrasts.fit() is currently almost instantaneous but the new code will repeat almost all the computation already done by lmFit(). That's the least of my concerns however.
I'm not going to list all the limma functions that have to be changed in this discussion forum. I have plans to migrate limma in this way, but it will take time as limma is a big package with a lot of interlocking parts.
Thanks for the detailed reply!
Indeed, ensuring that the change propagates properly to all of the functionality supported by limma seems like quite an undertaking.