I am a bit confused on how to perform and/or interpret the results of an ANOVA-like analysis in limma. I would like to compare 3 pairwise contrasts, essentially as described in paragraph 9.3 of the limma users guide. My code is running fine, and the output of the eBayes
function nicely returns the (moderated) F.p.value
as well as the p.values
of the 3 pairwise contrasts. So far, so good.
However, I assumed that the number of significantly genes identified by the F-test should equal the number of genes identified in any of the 3 contrasts... but I noticed this is not the case: the number of genes identified by the F-test is 3422, whereas this number for any of the 3 contrasts is much larger, namely 4438. Again, I expected that these numbers should be the same.
Question: is this difference to be expected (if so, why), or am I doing something wrong?
Thanks,
Guido
> x.norm ExpressionSet (storageMode: lockedEnvironment) assayData: 22690 features, 9 samples Annotation: moe430a > targets FileName Strain Treatment 1 KOcontrol2.CEL KO Con 2 KOcontrol3.CEL KO Con 3 KOcontrol5.CEL KO Con 7 WTcontrol20.CEL WT Con 8 WTcontrol21.CEL WT Con 9 WTcontrol22.CEL WT Con 10 WTWY25.CEL WT WY 11 WTWY26.CEL WT WY 12 WTWY27.CEL WT WY > > G.T <- factor(paste(targets$Strain, targets$Treatment, sep=".")) > > design <- model.matrix(~0+G.T) > colnames(design) <- levels(G.T) > design KO.Con WT.Con WT.WY 1 1 0 0 2 1 0 0 3 1 0 0 4 0 1 0 5 0 1 0 6 0 1 0 7 0 0 1 8 0 0 1 9 0 0 1 attr(,"assign") [1] 1 1 1 attr(,"contrasts") attr(,"contrasts")$G.T [1] "contr.treatment" > > > cont.matrix <- makeContrasts( + comp1 = (WT.Con - KO.Con), + comp2 = (WT.WY - KO.Con), + comp3 = (WT.WY - WT.Con), + levels=design) > > # First fit design > fit <- lmFit(x.norm,design) > > > # fit all contrasts > fit2 <- contrasts.fit(fit, cont.matrix) > > # Perform moderated (Bayesian smoothing) T-test > # Use trend=TRUE for intensity-based smoothing of prior, > # and robust=TRUE to protect against outlier genes. > fit3 <- eBayes(fit2, trend=TRUE, robust=TRUE) > > # Number of sign. genes in ANOVA (p<0.01) > dim(fit3[(fit3$F.p.value <0.01),]) [1] 3422 3 > > > > > # Perform pairwise, 'post-hoc'-like testing (p<0.01) > pairwise.sign <- decideTests(fit3, method="separate",adjust.method="none",p.value=0.01) > summary(pairwise.sign) comp1 comp2 comp3 Down 243 2274 1802 NotSig 22190 19049 19500 Up 257 1367 1388 > > #Number of probesets regulated in AT LEAST 1 contrast > dim(fit3[rowSums(abs(pairwise.sign)) !=0,]) [1] 4439 3 > > #Numbers F-test and pairwise comparisons don't match > # 3342 versus 4439... ???
Thanks you both for your valuable answers!
@ Ryan: method="nestedF" indeed gives the same number of genes.
@ Gordon: yes, point taken! I was so caught up with details that I could not see the wood for the trees...