Question: limma -- posthoc analysis when design factor has 3 levels
gravatar for mforde84
8 days ago by
mforde840 wrote:


I'm analysing some microarray and rnaseq data for gene set enrichment analysis. I have three independent sample groups. I'm using GSVA to do the enrichment analysis then using limma to fit a model to the GSVA enrichment scores.

p <- 10 ## number of genes
n <- 45 ## number of samples
nGrp1 <- 15 ## number of samples in group 1
nGrp2 <- 15 ## number of samples in group 2
nGrp3 <- 15 ## number of samples in group 2
## consider three disjoint gene sets
geneSets <- list(set1=paste("g", 1:3, sep=""),
set2=paste("g", 4:6, sep=""),
set3=paste("g", 7:10, sep=""))
## sample data from a normal distribution with mean 0 and 1
y <- matrix(rnorm(n*p), nrow=p, ncol=n,
dimnames=list(paste("g", 1:p, sep="") , paste("s", 1:n, sep="")))
## genes in set1 are expressed at higher levels in the last 10 samples
y[geneSets$set1, (nGrp1+1):n] <- y[geneSets$set1, (nGrp1+1):n] + 2
## build design matrix
design <- cbind(intercept=1, groups=c(rep(0, nGrp1), rep(1, nGrp2), rep(2, nGrp3)))
## estimate GSVA enrichment scores for the three sets
gsva_es <- gsva(y, geneSets, mx.diff=1)$es.obs
## fit linear model to the GSVA enrichment scores
fit <- lmFit(gsva_es, design)
## estimate moderated t-statistics
fit <- eBayes(fit)
## set1 is differentially expressed
topTable(fit, coef="groups")

This will give me the test-wise enrichment values and p-value, however I would like to do all additional group comparisons, e.g., between group 1vs2, 1vs3, and 2vs3. Is there a way to go about doing this with limma?

ADD COMMENTlink modified 8 days ago by Aaron Lun13k • written 8 days ago by mforde840
gravatar for Aaron Lun
8 days ago by
Aaron Lun13k
Cambridge, United Kingdom
Aaron Lun13k wrote:

I don't know about the GSVA part, but as for the limma code, your current design matrix is probably wrong. This is because it encodes the group IDs as a real-valued covariate rather than as a factor. For example, if you have a gene that is differentially expressed between groups, your current design expects that the log-fold change between groups 1 and 2 is the same as the log-fold change between groups 2 and 3. I doubt this is what you intended - rather, you should be doing something like this to set up your design:

groups <- factor(rep(1:3, c(nGrp1, nGrp2, nGrp3))
design <- model.matrix(~0 + groups)

Each coefficient of design represents the average (log-)expression in each group. It's simple to then use makeContrasts and to formulate comparisons between pairs of groups (or to do an ANOVA), followed by and eBayes to test those contrasts. I'll refer you to the documentation and the user's guide for more details, but as an example:

con <- makeContrasts(groups1 - groups2, levels=design) # DE between groups 1 and 2
fit2 <-, con)
fit2 <- eBayes(fit2)
ADD COMMENTlink modified 8 days ago • written 8 days ago by Aaron Lun13k

Thanks for the help.


ADD REPLYlink written 6 days ago by mforde840
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 167 users visited in the last hour