Consider an experiment (e.g. proteomics) producing ratios. Our null hypothesis is that each ratio is one. In this particular example, I have a data table with n rows (peptides) and m columns (replicates). I would like to use limma to do a differential expression of each peptide versus a constant of one. This should be a generalisation of a one-sample t-test.
I'm not quite sure how to build data and design matrix for such test. With only one condition I struggle to write a linear equation.
What I did, and I have a feeling is not a correct approach, is to add a column of ones to my data and marked it as a separate condition, as in this generated example.
library(limma) set.seed(1) ndat <- 10 nrep <- 3 # rows progressively diverging from the null hypothesis tab <- t(sapply(1:ndat, function(i) rnorm(nrep, mean=1 + 0.5 * (i - 1)/ndat, sd=0.1))) # add a reference column of 1 dat <- cbind(1, tab) meta <- data.frame(condition = c("1", rep("A", nrep))) design <- model.matrix(~condition, meta) # limma test fit <- limma::lmFit(dat, design) ebay <- limma::eBayes(fit) res <- limma::topTable(ebay, coef="conditionA", adjust="BH", sort.by="none")
This is probably not quite right as it corresponds, at a peptide level, to a two-sample t-test, where the first sample contains only one element (1). This certainly does not correspond to a one-sample t-test against mu = 1.
An independent one-sample t-test on these data does this:
pt <- apply(tab, 1, function(x) t.test(x, mu=1, alternative="two.sided")$p.value) plot(log10(res$P.Value), log10(pt))
If you re-do the above example for nrep=30, you will see that the limma p-values are much more conservative than those from a one-sample t-test. Since data are normal and the number of replicates is large, I would expect similar results from any test.
Can someone think of a better way of doing a one-condition test against a constant with limma?