Question: One-condition limma test
0
gravatar for M.Gierlinski
23 months ago by
United Kingdom
M.Gierlinski0 wrote:

Consider an experiment (e.g. proteomics) producing ratios. Our null hypothesis is that each ratio is one. In this particular example, I have a data table with n rows (peptides) and m columns (replicates). I would like to use limma to do a differential expression of each peptide versus a constant of one. This should be a generalisation of a one-sample t-test.

I'm not quite sure how to build data and design matrix for such test. With only one condition I struggle to write a linear equation.

What I did, and I have a feeling is not a correct approach, is to add a column of ones to my data and marked it as a separate condition, as in this generated example.

library(limma)

set.seed(1)
ndat <- 10
nrep <- 3

# rows progressively diverging from the null hypothesis
tab <- t(sapply(1:ndat, function(i) rnorm(nrep, mean=1 + 0.5 * (i - 1)/ndat, sd=0.1)))
# add a reference column of 1
dat <- cbind(1, tab)

meta <- data.frame(condition = c("1", rep("A", nrep)))
design <- model.matrix(~condition, meta)

# limma test
fit <- limma::lmFit(dat, design)
ebay <- limma::eBayes(fit)
res <- limma::topTable(ebay, coef="conditionA", adjust="BH", sort.by="none")

This is probably not quite right as it corresponds, at a peptide level, to a two-sample t-test, where the first sample contains only one element (1). This certainly does not correspond to a one-sample t-test against mu = 1.

An independent one-sample t-test on these data does this:

pt <- apply(tab, 1, function(x) t.test(x, mu=1, alternative="two.sided")$p.value)
plot(log10(res$P.Value), log10(pt))

If you re-do the above example for nrep=30, you will see that the limma p-values are much more conservative than those from a one-sample t-test. Since data are normal and the number of replicates is large, I would expect similar results from any test.

Can someone think of a better way of doing a one-condition test against a constant with limma?

limma t-test • 452 views
ADD COMMENTlink modified 23 months ago by James W. MacDonald51k • written 23 months ago by M.Gierlinski0
Answer: One-condition limma test
2
gravatar for Ryan C. Thompson
23 months ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson7.4k wrote:

Limma typically works with data on a log scale, so I think you would want to work with log ratios. If you take the log of your data, then your null hypothesis becomes logratio = 0. You can test this hypothesis by fitting an intercept-only model (i.e. ~1) and testing whether the sole coefficient is nonzero.

dat <- log2(tab)
design <- cbind(Intercept=rep(1, ncol(dat)))
fit <- lmFit(dat, design)
fit <- eBayes(fit)
topTable(fit, coef="Intercept")

If you have a more complicated setup in which each observation is a log-ratio rather than an abundance measurement, you can use limma's modelMatrix (note: different from model.matrix) function, which was designed for 2-color arrays.

ADD COMMENTlink written 23 months ago by Ryan C. Thompson7.4k

This makes sense. Certainly, log-ratios are more symmetric (as mentioned by James below) and the null hypothesis of zero allows for the intercept-only model. Thanks!

 

ADD REPLYlink written 23 months ago by M.Gierlinski0
Answer: One-condition limma test
2
gravatar for James W. MacDonald
23 months ago by
United States
James W. MacDonald51k wrote:

Ratios aren't symmetric around 1, so you shouldn't be using those data as input anyway. Instead you should take logs (probably base 2, for interpretability), which will make your ratios symmetric, and will also make the mean under the null equal to zero, in which case you can proceed directly.

And do note that the p-values won't be directly comparable anyway, due to the fact that you are doing empirical Bayes moderation of the variance estimates. Plus you aren't adjusting the t-stat p-values for multiplicity, but you are adjusting those for limma.

> tab <- log2(tab)

> pt <- apply(tab, 1, function(x) t.test(x, alternative="two.sided")$p.value)

> fit <- lmFit(dat)
> res <- eBayes(fit)

## convert to 'regular' t-stats
> tstat.ord <- res$coef/res$st.dev.unscaled/res$sigma

> data.frame(Regular = pt, limma = 2*pt(-abs(tstat.ord), 2), limmaModerated = topTable(res, adjust = "none", sort.by = "none")$P.Value)
        Regular           limma      limmaModerated
1  0.2988531584 0.2988531584   0.2163826978
2  0.3437222974 0.3437222974   0.1863790403
3  0.0018106705 0.0018106705   0.0020170729
4  0.0525491928 0.0525491928   0.0076232404
5  0.2742862056 0.2742862056   0.1154152635
6  0.0102604941 0.0102604941   0.0007851966
7  0.0004774883 0.0004774883   0.0001029651
8  0.0544338857 0.0544338857   0.0058874218
9  0.0024360677 0.0024360677   0.0001232390
10 0.0134288607 0.0134288607   0.0006044474

 

 

ADD COMMENTlink written 23 months ago by James W. MacDonald51k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 223 users visited in the last hour