Hi,
I only noticed this by chance, but it piqued my curiosity and I couldn't find any documentation about it. I occasionally get asked about experiments where conditions are in singlet (no replication), and my usual line is "no replication, no statistics". However, in the interest of making the most of the available data, I was aware of DESeq2's single sample method, in which all samples are treated as a single group for dispersion estimation. When running the same singlet design in Limma, I expected an error when testing the model fit, but I got none.
So my question is: what is Limma doing in the case of a singlet comparison A vs B? My assumption was that the moderated t-test couldn't be calculated without replication.
Some code just to make a reproducible example:
library(limma)
set.seed(1234)

# Groups A and B are singlets; groups C and D have three replicates each
pheno_table <- data.frame(Rep = c(paste0("A",1:8)),
                          Variable = c("A","B","C","C","C","D","D","D"))

# Simulated log2 expression matrix: 100 rows, one column per sample
mat_rows <- 100
matrix_in <- matrix(log2(rnorm(n = (mat_rows*nrow(pheno_table)), mean = 100, sd = 30)),
                    nrow = mat_rows)

# Group-means design with one coefficient per level of Variable
design <- model.matrix(~0 + Variable, data = pheno_table)
colnames(design) <- gsub("Variable", "", colnames(design))

# Singlet-versus-singlet contrast
contrasts <- c("A_Vs_B" = "B-A")
contrast_mat <- makeContrasts(contrasts = contrasts, levels = colnames(design))
colnames(contrast_mat) <- names(contrasts)

fit <- eBayes(contrasts.fit(lmFit(matrix_in, design), contrast_mat))
topTable(fit, coef = 1, number = 5, p.value = 0.1)
#        logFC AveExpr         t      P.Value  adj.P.Val          B
# 92 -2.909594 6.21836 -4.935319 0.0009739746 0.09739746 -0.3183399
Thanks for suggesting the edgeR manual, it's nice to have a place where it's written down that I can point people to. I think it would be useful for Limma to output a warning when singlet comparisons are made, summarising what you've said. Thanks though for the comprehensive response!
Singlet comparisons (n=1 in some groups but n>1 in others) are quite routine and I don't see any reason for a warning from limma. The complications that Aaron discussed above relate to DESeq2's single sample method, which is quite a different thing, and he was explaining why it isn't used in limma. By contrast, limma's method for your experiment is standard linear modelling and does not require any such compromises.
DESeq2's single sample method and edgeR's no replicate methods are really just red herrings here. Your experiment does have replication and none of the packages (limma, edgeR or DESeq2) would need to resort to a "no replicate" method for your design.
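To make the point about replication concrete, here is a minimal sketch (not taken from the thread; the simulated matrix `y` and the names `group`, `within_ss` and `pooled_sd` are made up for illustration) that checks where the residual variance comes from in a design like the one above. It assumes the same layout of eight samples: singlets A and B, then three replicates each of C and D.

```r
library(limma)
set.seed(1234)

# Hypothetical layout mirroring the example: A and B singlets, C and D triplicates
group <- factor(c("A", "B", "C", "C", "C", "D", "D", "D"))
y <- matrix(rnorm(100 * length(group)), nrow = 100)

design <- model.matrix(~0 + group)
fit <- lmFit(y, design)

# Residual degrees of freedom: 8 samples - 4 group means = 4 for every row
unique(fit$df.residual)

# The singlet samples are fitted exactly, so only C and D contribute residuals.
# Pooled within-group sum of squares from the replicated groups:
within_ss <- function(x) sum((x - mean(x))^2)
resid_ss <- apply(y[, group == "C"], 1, within_ss) +
            apply(y[, group == "D"], 1, within_ss)
pooled_sd <- sqrt(resid_ss / 4)

# limma's residual standard deviation should match the pooled estimate
all.equal(unname(fit$sigma), unname(pooled_sd))

# For B - A, the unscaled standard error is sqrt(1/1 + 1/1), one sample per group
fit2 <- contrasts.fit(fit, makeContrasts(groupB - groupA, levels = design))
fit2$stdev.unscaled[1, 1]
```

The B-A comparison therefore borrows its variance estimate from the replicated groups C and D, under the usual linear-model assumption that all groups share the same variance for a given gene. If the design contained only the two singlet samples, the residual degrees of freedom would be zero and there would be no such estimate to borrow.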