Question

Design and contrast question limma (additive or nested or duplicateCorrelation())??

1

Entering edit mode

Guido Hooiveld ★ 4.1k

@guido-hooiveld-2020

Last seen 7 weeks ago

Wageningen University, Wageningen, the …

Apologies for yet again a question on how to properly define the design and contrasts in limma for an experiment... However, despite reading the docs, searching the support forum, and Googling, I am still not clear how to properly do this for my experiment.

I have array data from a mouse experiment in which the treatments were made on dams (2 diets; LFD and W), but the outcomes were measured in livers of all offspring of these dams, in both male and female pups. Obviously I would like to take the ‘dam/litter effect’ into account. Question: how to define the design and contrasts to do this?

Research questions to be answered:

Which genes are regulated in the pups by the diets of the mother?
Is there a diet effect in each gender (sexe)?
Is there a differential response between the two genders?

Initially I fitted an additive model (group means + litter), but then I faced the infamous “coefficients not estimable” warning. Still i identified genes for the above two questions, although topTable didn't like it.

However, I also realized that this is a nested design because dam/litter is nested with diets. After renumbering the litters, i fitted a ‘traditional’ model (including intercept). All coefficients were estimable, but now I am not sure which coefficients to compare to answer my research questions....

Finally, a long time ago Gordon recommended the use of duplicateCorrelation() for an experiment with *similar* design. Could this also (best) applied here?

Any feedback would be appreciated!

Thanks,

Guido

cel means additive model

targets (used with additive model):

Filename    Sexe    Diet    Litter
sample1    M    LFD    L37
sample2    M    LFD    L37
sample3    M    LFD    L49
sample4    M    LFD    L49
sample5    M    LFD    L50
sample6    M    LFD    L50
sample7    M    WD    L48
sample8    M    WD    L48
sample9    M    WD    L48
sample10    M    WD    L48
sample11    M    WD    L40
sample12    M    WD    L40
sample13    F    LFD    L49
sample14    F    LFD    L50
sample15    F    LFD    L37
sample16    F    LFD    L37
sample17    F    LFD    L37
sample18    F    LFD    L49
sample19    F    WD    L39
sample20    F    WD    L39
sample21    F    WD    L40
sample22    F    WD    L40
sample23    F    WD    L48
sample24    F    WD    L48


> SD <- as.factor(paste(targets$Sexe, targets$Diet, sep="."))
> Litter <- factor(targets$Litter)
>
>
> design <- model.matrix(~0+SD+Litter)
> design
   SDF.LFD SDF.WD SDM.LFD SDM.WD LitterL39 LitterL40 LitterL48 LitterL49 LitterL50
1        0      0       1      0         0         0         0         0         0
2        0      0       1      0         0         0         0         0         0
3        0      0       1      0         0         0         0         1         0
4        0      0       1      0         0         0         0         1         0
5        0      0       1      0         0         0         0         0         1
6        0      0       1      0         0         0         0         0         1
7        0      0       0      1         0         0         1         0         0
8        0      0       0      1         0         0         1         0         0
9        0      0       0      1         0         0         1         0         0
10       0      0       0      1         0         0         1         0         0
11       0      0       0      1         0         1         0         0         0
12       0      0       0      1         0         1         0         0         0
13       1      0       0      0         0         0         0         1         0
14       1      0       0      0         0         0         0         0         1
15       1      0       0      0         0         0         0         0         0
16       1      0       0      0         0         0         0         0         0
17       1      0       0      0         0         0         0         0         0
18       1      0       0      0         0         0         0         1         0
19       0      1       0      0         1         0         0         0         0
20       0      1       0      0         1         0         0         0         0
21       0      1       0      0         0         1         0         0         0
22       0      1       0      0         0         1         0         0         0
23       0      1       0      0         0         0         1         0         0
24       0      1       0      0         0         0         1         0         0
attr(,"assign")
[1] 1 1 1 1 2 2 2 2 2
attr(,"contrasts")
attr(,"contrasts")$SD
[1] "contr.treatment"

attr(,"contrasts")$Litter
[1] "contr.treatment"

> fit <- lmFit(x.norm,design)
Coefficients not estimable: LitterL48
Warning message:
Partial NA coefficients for 12332 probe(s)
> cont.matrix <- makeContrasts(
+ WD_LFD = ( (SDF.WD + SDM.WD)/2 - (SDF.LFD + SDM.LFD)/2 ), #diet main effect
+ WD.F_LFD.F = (SDF.WD - SDF.LFD), #diet efefct in female pups
+ WD.M_LFD.M = (SDM.WD - SDM.LFD), #diet effect in male pups
+ delta.sexe = ((SDF.WD - SDF.LFD) - (SDM.WD - SDM.LFD)), #sexe-dep genes
+ levels=design)
>
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
> topTable(fit2)
Error in topTableF(fit, number = number, genelist = genelist, adjust.method = adjust.method,  :
  F-statistics not found in fit
> topTable(fit2,coef=1)
               logFC  AveExpr         t      P.Value   adj.P.Val        B
625599_at -0.7151058 6.119586 -6.990635 2.479186e-07 0.001589408 6.373436
11303_at  -0.5876455 7.568066 -6.974488 2.577697e-07 0.001589408 6.341612
20787_at  -0.3881564 9.571703 -6.511793 7.979145e-07 0.002118622 5.411698
15202_at  -0.9022215 4.951477 -6.781162 8.324025e-07 0.002118622 5.347300
381524_at -1.0326256 7.693721 -7.155681 8.589938e-07 0.002118622 5.269775
54712_at  -0.5565909 6.665185 -6.183399 1.805718e-06 0.003489665 4.731520
15233_at  -1.4009764 8.011086 -7.054168 1.980835e-06 0.003489665 4.565668
108956_at  0.5669231 8.105670  6.074023 2.934636e-06 0.004523742 4.321340
19013_at  -0.4013013 8.215255 -5.681969 6.415447e-06 0.008790587 3.663809
20855_at  -0.4206979 6.216767 -5.579242 8.340307e-06 0.010285266 3.441188
>

targets for nested design; litter renumbered

Filename    Sexe    Diet    Litter
sample1    M    LFD    1
sample2    M    LFD    1
sample3    M    LFD    2
sample4    M    LFD    2
sample5    M    LFD    3
sample6    M    LFD    3
sample7    M    WD    3
sample8    M    WD    3
sample9    M    WD    3
sample10    M    WD    3
sample11    M    WD    2
sample12    M    WD    2
sample13    F    LFD    2
sample14    F    LFD    3
sample15    F    LFD    1
sample16    F    LFD    1
sample17    F    LFD    1
sample18    F    LFD    2
sample19    F    WD    1
sample20    F    WD    1
sample21    F    WD    2
sample22    F    WD    2
sample23    F    WD    3
sample24    F    WD    3


> Diet <-relevel(as.factor(targets$Diet), ref="LFD")
> Sexe <- relevel(factor(targets$Sexe), ref="M")
> Litter <- factor(targets$Litter)
>
> design <- model.matrix(~Diet+Diet:Sexe+Diet:Litter)
>
> colnames(design) <- make.names(colnames(design))
> design
   X.Intercept. DietWD DietLFD.SexeF DietWD.SexeF DietLFD.Litter2 DietWD.Litter2 DietLFD.Litter3 DietWD.Litter3
1             1      0             0            0               0              0               0              0
2             1      0             0            0               0              0               0              0
3             1      0             0            0               1              0               0              0
4             1      0             0            0               1              0               0              0
5             1      0             0            0               0              0               1              0
6             1      0             0            0               0              0               1              0
7             1      1             0            0               0              0               0              1
8             1      1             0            0               0              0               0              1
9             1      1             0            0               0              0               0              1
10            1      1             0            0               0              0               0              1
11            1      1             0            0               0              1               0              0
12            1      1             0            0               0              1               0              0
13            1      0             1            0               1              0               0              0
14            1      0             1            0               0              0               1              0
15            1      0             1            0               0              0               0              0
16            1      0             1            0               0              0               0              0
17            1      0             1            0               0              0               0              0
18            1      0             1            0               1              0               0              0
19            1      1             0            1               0              0               0              0
20            1      1             0            1               0              0               0              0
21            1      1             0            1               0              1               0              0
22            1      1             0            1               0              1               0              0
23            1      1             0            1               0              0               0              1
24            1      1             0            1               0              0               0              1
attr(,"assign")
[1] 0 1 2 2 3 3 3 3
attr(,"contrasts")
attr(,"contrasts")$Diet
[1] "contr.treatment"

attr(,"contrasts")$Sexe
[1] "contr.treatment"

attr(,"contrasts")$Litter
[1] "contr.treatment"

>
> fit <- lmFit(x.norm,design)
>

> # Intercept equals mean expression in LFD-male
>
> cont.matrix <- makeContrasts(
+ WD_LFD = ( (DietWD+DietWD.SexeF) - DietLFD.SexeF   ), #diet main effect??
+ WD.F_LFD.F = (DietWD.SexeF - DietLFD.SexeF), #diet efefct in female pups??
+ WD.M_LFD.M = (DietWD), #diet effect in male pups??
+ delta.sexe = ((DietWD.SexeF - DietLFD.SexeF) -  DietWD), #sexe-dep genes??
+ levels=design)
>
>
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
> topTable(fit2)
              WD_LFD WD.F_LFD.F WD.M_LFD.M  delta.sexe   AveExpr        F      P.Value    adj.P.Val
22139_at   0.8232779  0.4584550  0.3648229  0.09363214 10.113522 38.30188 2.405464e-08 0.0002966418
17748_at  -0.6849334 -0.3371893 -0.3477441  0.01055483 11.319395 30.75738 1.799528e-07 0.0011095887
69036_at  -0.6605604 -0.2511031 -0.4094573  0.15835417 12.554694 29.08963 2.943635e-07 0.0012100302
20787_at  -0.5784347 -0.3335203 -0.2449145 -0.08860582  9.571703 27.47740 4.828340e-07 0.0014885772
226123_at  0.6573526  0.4589262  0.1984264  0.26049983  7.369280 24.79782 1.150587e-06 0.0028378068
11303_at  -0.7741369 -0.2470930 -0.5270440  0.27995100  7.568066 21.72466 3.375199e-06 0.0054942399
18703_at  -0.5938379 -0.3379725 -0.2558654 -0.08210715 10.945976 21.69258 3.415043e-06 0.0054942399
68371_at  -0.7809301 -0.5201257 -0.2608044 -0.25932122  9.174027 22.62103 3.564217e-06 0.0054942399
240660_at -0.5326507 -0.3816188 -0.1510319 -0.23058683  8.190418 20.47978 5.366370e-06 0.0073531199
237847_at -0.5386111 -0.4031175 -0.1354937 -0.26762380  7.751602 19.84829 6.835333e-06 0.0084293321
>

> topTable(fit2,coef=1)
               logFC   AveExpr         t      P.Value    adj.P.Val        B
22139_at   0.8232779 10.113522  8.199715 1.473255e-08 0.0001816818 9.153816
17748_at  -0.6849334 11.319395 -7.494534 7.470096e-08 0.0003140067 7.766977
69036_at  -0.6605604 12.554694 -7.485026 7.638826e-08 0.0003140067 7.747720
20787_at  -0.5784347  9.571703 -6.899323 3.091683e-07 0.0009531658 6.533630
11303_at  -0.7741369  7.568066 -6.532403 7.583463e-07 0.0018703853 5.746102
226123_at  0.6573526  7.369280  6.261047 1.487074e-06 0.0030564329 5.151217
242109_at  0.7940300  6.865846  6.180800 1.817511e-06 0.0030592229 4.973379
18703_at  -0.5938379 10.945976 -6.145715 1.984575e-06 0.0030592229 4.895362
68371_at  -0.7809301  9.174027 -6.057329 3.424477e-06 0.0046922949 4.420740
240660_at -0.5326507  8.190418 -5.647087 7.012635e-06 0.0077855083 3.770503
> # got stuck: both topTables for main diet effect do NOT match...

limma model matrix limma design matrix complex-design • 4.5k views

ADD COMMENT • link updated 10.0 years ago by Aaron Lun ★ 28k • written 10.0 years ago by Guido Hooiveld ★ 4.1k

score 4 · Answer 1 · 2015-06-18

4

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 40 minutes ago

The city by the bay

Your diet is nested within the litter effect, so any differential expression between diets will be absorbed into the blocking factor for the litter. This is a good use case for duplicateCorrelation:

design <- model.matrix(~0 + SD)
colnames(design) <- levels(SD)
corfit <- duplicateCorrelation(x.norm, design, block=Litter)
fit <- lmFit(x.norm, design, block=Litter, correlation=corfit$consensus)

You can then test between groups to your heart's desire. To test between diets, you can do:

con <- makeContrasts(M.LFD - M.WD, levels=design) # For males
con <- makeContrasts(F.LFD - F.WD, levels=design) # For females
con <- makeContrasts((M.LFD + F.LFD)/2 - (M.WD + F.WD)/2, levels=design) # Average of both

You may or may not want to intersect the first two comparisons to find genes that change consistently between diets for both males and females. For the second question, to test between sexes, you can do:

con <- makeContrasts(M.LFD - F.LFD, levels=design) # For LFD
con <- makeContrasts(M.WD - F.WD, levels=design) # For WD
con <- makeContrasts((M.LFD + M.WD)/2 - (F.LFD + F.WD)/2, levels=design) # Average of both

Finally, to test for a differential response between sexes, you can do:

con <- makeContrasts((M.LFD - M.WD) - (F.LFD - F.WD), levels=design)

As an aside, for your second and third questions, a more elegant model would block on the litter effect directly:

> design2 <- model.matrix(~0 + Litter + SD)
> design2 <- design2[,-7]
> colnames(design2)
[1] "LitterL37" "LitterL39" "LitterL40" "LitterL48" "LitterL49" "LitterL50"
[7] "SDM.LFD"   "SDM.WD"

This design matrix can be used directly in lmFit without running it through duplicateCorrelations first. The first 6 terms represent the average expression level of the female mice in each litter. The final two terms represent the average log-fold change for the male mice over the females, in the LFD or WD-fed litters. You can then drop either of these terms to get the sex effect for each diet, or you could compare the sex effects between diets for your third question:

con <- makeContrasts(SDM.LFD - SDM.WD, levels=design2)

This approach models the litter effect explicitly, so it avoids some of the assumptions in duplicateCorrelations. That said, the duplicateCorrelations approach that I described above will also work quite well. It's up to you whether you want to make the effort to try out this other model specifically to answer your second and third questions.

ADD COMMENT • link 10.0 years ago Aaron Lun ★ 28k

1

Entering edit mode

Hi Aaron,

Thanks very much for your reply! I fully understand your duplicateCorrelation approach. However, I am (still) puzzled by your 2nd approach in which directly is blocked on litter:

Specifically, why do you drop the 7th coefficient (SDF.WD) from the design (and not another one)?

Also, i would appreciate if you could spend few more words on when to drop which term for what contrasts, and how you would define these.

Thanks for answering these (likely naive) questions.

ADD REPLY • link 10.0 years ago Guido Hooiveld ★ 4.1k

1

Entering edit mode

I dropped the 7th coefficient because it made the interpretation of the remaining terms easier. If I dropped SDM.WD and retained SDF.WD, the various Litter terms would represent average expression across all females for LFD litters but across males for WD litters. This is unnecessarily confusing, so I dropped SDF.WD to ensure that the Litter terms represent average expression across females for each litter.

As for the contrasts, you can just set coef=7 in topTable to test the sex effect across LFD litters. Similarly, you can set coef=8 to test the sex effect across WD litters. This makes sense, because the corresponding coefficients in design2 represent the sex effect of males over females for each diet.

ADD REPLY • link 10.0 years ago Aaron Lun ★ 28k

0

Entering edit mode

Hi Aaron,

I have doubt why dropping coef number 7 the final two terms represent the average log-fold change for male mice over female, in the LFD or WD-fed mice. The design matrix would be:

SDM.LFD	SDM.WD	Filename	Sexe	Diet
0	0	sample13	F	LFD
0	0	sample14	F	LFD
0	0	sample15	F	LFD
0	0	sample16	F	LFD
0	0	sample17	F	LFD
0	0	sample18	F	LFD
0	0	sample19	F	WD
0	0	sample20	F	WD
0	0	sample21	F	WD
0	0	sample22	F	WD
0	0	sample23	F	WD
0	0	sample24	F	WD
1	0	sample1	M	LFD
1	0	sample2	M	LFD
1	0	sample3	M	LFD
1	0	sample4	M	LFD
1	0	sample5	M	LFD
1	0	sample6	M	LFD
0	1	sample7	M	WD
0	1	sample8	M	WD
0	1	sample9	M	WD
0	1	sample10	M	WD
0	1	sample11	M	WD
0	1	sample12	M	WD

Do the last two terms represent SDM.LFD-SDF.LFD and SDM.WD-SDF.LFD? Am I wrong?

Best,

Keifa

ADD REPLY • link 7.0 years ago Keifa ▴ 10

0

Entering edit mode

Ask a new question.

ADD REPLY • link 7.0 years ago Aaron Lun ★ 28k

0

Entering edit mode

Done: Limma DesignMatrix, dropping terms

ADD REPLY • link 7.0 years ago Keifa ▴ 10