Hi all,
I have an intercept issue that I do not understand with limma.
Let's say that I have a simple model with one group 'cancer sub-type' which contains 4 factors (TNBC, NonTNBC, HER2, Control).
I wrote a model which has either an intercept or not. If I add an intercept, the Control factor is used as reference.
In term of contrast, it means that testing 'subtypeTNBC' in the model with intercept is the same than testing 'subtypeTNBC - subtypeControl' in the model without intercept. Both tests compare the mean expression of TNBC vs Control groups.
However, if I want now to compare two other groups, let's say NonTNBC and TNBC ... using the same contrasts (as the Intercept is expected to correspond to the Control group), I have different results ...
Many thanks for your help
N
design <- model.matrix(~ 0 + subtype, data=splan)
v <- voom(y, design, plot=FALSE)
fit <- lmFit(v, design)
contrast <-makeContrasts(subtypeTNBC - subtypeNonTNBC, levels=design)
fit <- contrasts.fit(fit, contrast)
eb.wI <- eBayes(fit)
res.wI <- topTable(eb.wI, number=1e6, adjust.method="BH")
design.noI <- model.matrix(~ subtype, data=splan)
v <- voom(y, design.noI, plot=FALSE)
fit <- lmFit(v, design.noI)
contrast <-makeContrasts(subtypeTNBC - subtypeNonTNBC, levels=design.noI)
fit <- contrasts.fit(fit, contrast)
eb.noI <- eBayes(fit)
res.noI <- topTable(eb.noI, number=1e6, adjust.method="BH")

constrasts.fit() is always exact when the original fit was a oneway layout with the "~ 0 + etc" group mean parametrization:
> topTable(fit.out2) logFC AveExpr t P.Value adj.P.Val B 7663 3.919609 0.27630782 5.115955 2.805395e-05 0.2070274 2.1636498 9605 3.197820 -0.28132385 4.964980 4.140547e-05 0.2070274 1.9756614 5614 3.007006 -0.13776775 4.687533 8.481148e-05 0.2527696 1.4164484 8374 -3.653325 0.01628550 -4.589608 1.092572e-04 0.2527696 0.9661284 4207 -2.772304 0.29816848 -4.416522 1.709205e-04 0.2527696 0.8309572 5306 2.380814 0.47113940 4.381388 1.871601e-04 0.2527696 0.8081607 3095 -3.125466 -0.34076551 -4.393183 1.815435e-04 0.2527696 0.7136030 9732 -2.921570 -0.02915717 -4.280462 2.428584e-04 0.2527696 0.4229538 171 3.193572 0.73805534 4.149116 3.406531e-04 0.2527696 0.1894304 4513 -2.802365 0.10369028 -4.122553 3.647409e-04 0.2527696 0.1373498 > g <- relevel(g,ref="4") > X.in <- model.matrix(~g) > fit.in <- lmFit(y, X.in, weights=w) > fit.in <- eBayes(fit.in) > topTable(fit.in, coef="g5") logFC AveExpr t P.Value adj.P.Val B 7663 3.919609 0.27630782 5.115955 2.805395e-05 0.2070274 2.1636498 9605 3.197820 -0.28132385 4.964980 4.140547e-05 0.2070274 1.9756614 5614 3.007006 -0.13776775 4.687533 8.481148e-05 0.2527696 1.4164484 8374 -3.653325 0.01628550 -4.589608 1.092572e-04 0.2527696 0.9661284 4207 -2.772304 0.29816848 -4.416522 1.709205e-04 0.2527696 0.8309572 5306 2.380814 0.47113940 4.381388 1.871601e-04 0.2527696 0.8081607 3095 -3.125466 -0.34076551 -4.393183 1.815435e-04 0.2527696 0.7136030 9732 -2.921570 -0.02915717 -4.280462 2.428584e-04 0.2527696 0.4229538 171 3.193572 0.73805534 4.149116 3.406531e-04 0.2527696 0.1894304 4513 -2.802365 0.10369028 -4.122553 3.647409e-04 0.2527696 0.1373498The results from the direct fit.in model above agrees with those from your fit.out model and both are exact.
In general, contrasts.fit() is always exact when there are no weights or when the coefficients being compared in the contrast are statistically independent.