Question: Design and contrast question limma (additive or nested or duplicateCorrelation())??
1
3.9 years ago by
Guido Hooiveld2.4k
Wageningen University, Wageningen, the Netherlands
Guido Hooiveld2.4k wrote:

Apologies for yet again a question on how to properly define the design and contrasts in limma for an experiment... However, despite reading the docs, searching the support forum, and Googling, I am still not clear how to properly do this for my experiment.

I have array data from a mouse experiment in which the treatments were made on dams (2 diets; LFD and W), but the outcomes were measured in livers of all offspring of these dams, in both male and female pups. Obviously I would like to take the ‘dam/litter effect’ into account. Question: how to define the design and contrasts to do this?

1. Which genes are regulated in the pups by the diets of the mother?
2. Is there a diet effect in each gender (sexe)?
3. Is there a differential response between the two genders?

Initially I fitted an additive model (group means + litter), but then I faced the infamous “coefficients not estimable” warning. Still i identified genes for the above two questions, although topTable didn't like it.

However, I also realized that this is a nested design because dam/litter is nested with diets. After renumbering the litters, i fitted a ‘traditional’ model (including intercept). All coefficients were estimable, but now I am not sure which coefficients to compare to answer my research questions....

Finally, a long time ago Gordon recommended the use of duplicateCorrelation() for an experiment with *similar* design. Could this also (best) applied here?

Any feedback would be appreciated!

Thanks,

Guido

cel means additive model

Filename    Sexe    Diet    Litter
sample1    M    LFD    L37
sample2    M    LFD    L37
sample3    M    LFD    L49
sample4    M    LFD    L49
sample5    M    LFD    L50
sample6    M    LFD    L50
sample7    M    WD    L48
sample8    M    WD    L48
sample9    M    WD    L48
sample10    M    WD    L48
sample11    M    WD    L40
sample12    M    WD    L40
sample13    F    LFD    L49
sample14    F    LFD    L50
sample15    F    LFD    L37
sample16    F    LFD    L37
sample17    F    LFD    L37
sample18    F    LFD    L49
sample19    F    WD    L39
sample20    F    WD    L39
sample21    F    WD    L40
sample22    F    WD    L40
sample23    F    WD    L48
sample24    F    WD    L48

> SD <- as.factor(paste(targets$Sexe, targets$Diet, sep="."))
> Litter <- factor(targets$Litter) > > > design <- model.matrix(~0+SD+Litter) > design SDF.LFD SDF.WD SDM.LFD SDM.WD LitterL39 LitterL40 LitterL48 LitterL49 LitterL50 1 0 0 1 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 3 0 0 1 0 0 0 0 1 0 4 0 0 1 0 0 0 0 1 0 5 0 0 1 0 0 0 0 0 1 6 0 0 1 0 0 0 0 0 1 7 0 0 0 1 0 0 1 0 0 8 0 0 0 1 0 0 1 0 0 9 0 0 0 1 0 0 1 0 0 10 0 0 0 1 0 0 1 0 0 11 0 0 0 1 0 1 0 0 0 12 0 0 0 1 0 1 0 0 0 13 1 0 0 0 0 0 0 1 0 14 1 0 0 0 0 0 0 0 1 15 1 0 0 0 0 0 0 0 0 16 1 0 0 0 0 0 0 0 0 17 1 0 0 0 0 0 0 0 0 18 1 0 0 0 0 0 0 1 0 19 0 1 0 0 1 0 0 0 0 20 0 1 0 0 1 0 0 0 0 21 0 1 0 0 0 1 0 0 0 22 0 1 0 0 0 1 0 0 0 23 0 1 0 0 0 0 1 0 0 24 0 1 0 0 0 0 1 0 0 attr(,"assign") [1] 1 1 1 1 2 2 2 2 2 attr(,"contrasts") attr(,"contrasts")$SD
[1] "contr.treatment"

attr(,"contrasts")$Litter [1] "contr.treatment" > fit <- lmFit(x.norm,design) Coefficients not estimable: LitterL48 Warning message: Partial NA coefficients for 12332 probe(s) > cont.matrix <- makeContrasts( + WD_LFD = ( (SDF.WD + SDM.WD)/2 - (SDF.LFD + SDM.LFD)/2 ), #diet main effect + WD.F_LFD.F = (SDF.WD - SDF.LFD), #diet efefct in female pups + WD.M_LFD.M = (SDM.WD - SDM.LFD), #diet effect in male pups + delta.sexe = ((SDF.WD - SDF.LFD) - (SDM.WD - SDM.LFD)), #sexe-dep genes + levels=design) > > fit2 <- contrasts.fit(fit, cont.matrix) > fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE) > topTable(fit2) Error in topTableF(fit, number = number, genelist = genelist, adjust.method = adjust.method, : F-statistics not found in fit > topTable(fit2,coef=1) logFC AveExpr t P.Value adj.P.Val B 625599_at -0.7151058 6.119586 -6.990635 2.479186e-07 0.001589408 6.373436 11303_at -0.5876455 7.568066 -6.974488 2.577697e-07 0.001589408 6.341612 20787_at -0.3881564 9.571703 -6.511793 7.979145e-07 0.002118622 5.411698 15202_at -0.9022215 4.951477 -6.781162 8.324025e-07 0.002118622 5.347300 381524_at -1.0326256 7.693721 -7.155681 8.589938e-07 0.002118622 5.269775 54712_at -0.5565909 6.665185 -6.183399 1.805718e-06 0.003489665 4.731520 15233_at -1.4009764 8.011086 -7.054168 1.980835e-06 0.003489665 4.565668 108956_at 0.5669231 8.105670 6.074023 2.934636e-06 0.004523742 4.321340 19013_at -0.4013013 8.215255 -5.681969 6.415447e-06 0.008790587 3.663809 20855_at -0.4206979 6.216767 -5.579242 8.340307e-06 0.010285266 3.441188 > targets for nested design; litter renumbered Filename Sexe Diet Litter sample1 M LFD 1 sample2 M LFD 1 sample3 M LFD 2 sample4 M LFD 2 sample5 M LFD 3 sample6 M LFD 3 sample7 M WD 3 sample8 M WD 3 sample9 M WD 3 sample10 M WD 3 sample11 M WD 2 sample12 M WD 2 sample13 F LFD 2 sample14 F LFD 3 sample15 F LFD 1 sample16 F LFD 1 sample17 F LFD 1 sample18 F LFD 2 sample19 F WD 1 sample20 F WD 1 sample21 F WD 2 sample22 F WD 2 sample23 F WD 3 sample24 F WD 3 > Diet <-relevel(as.factor(targets$Diet), ref="LFD")
> Sexe <- relevel(factor(targets$Sexe), ref="M") > Litter <- factor(targets$Litter)
>
> design <- model.matrix(~Diet+Diet:Sexe+Diet:Litter)
>
> colnames(design) <- make.names(colnames(design))
> design
X.Intercept. DietWD DietLFD.SexeF DietWD.SexeF DietLFD.Litter2 DietWD.Litter2 DietLFD.Litter3 DietWD.Litter3
1             1      0             0            0               0              0               0              0
2             1      0             0            0               0              0               0              0
3             1      0             0            0               1              0               0              0
4             1      0             0            0               1              0               0              0
5             1      0             0            0               0              0               1              0
6             1      0             0            0               0              0               1              0
7             1      1             0            0               0              0               0              1
8             1      1             0            0               0              0               0              1
9             1      1             0            0               0              0               0              1
10            1      1             0            0               0              0               0              1
11            1      1             0            0               0              1               0              0
12            1      1             0            0               0              1               0              0
13            1      0             1            0               1              0               0              0
14            1      0             1            0               0              0               1              0
15            1      0             1            0               0              0               0              0
16            1      0             1            0               0              0               0              0
17            1      0             1            0               0              0               0              0
18            1      0             1            0               1              0               0              0
19            1      1             0            1               0              0               0              0
20            1      1             0            1               0              0               0              0
21            1      1             0            1               0              1               0              0
22            1      1             0            1               0              1               0              0
23            1      1             0            1               0              0               0              1
24            1      1             0            1               0              0               0              1
attr(,"assign")
[1] 0 1 2 2 3 3 3 3
attr(,"contrasts")
attr(,"contrasts")$Diet [1] "contr.treatment" attr(,"contrasts")$Sexe
[1] "contr.treatment"

attr(,"contrasts")$Litter [1] "contr.treatment" > > fit <- lmFit(x.norm,design) > > # Intercept equals mean expression in LFD-male > > cont.matrix <- makeContrasts( + WD_LFD = ( (DietWD+DietWD.SexeF) - DietLFD.SexeF ), #diet main effect?? + WD.F_LFD.F = (DietWD.SexeF - DietLFD.SexeF), #diet efefct in female pups?? + WD.M_LFD.M = (DietWD), #diet effect in male pups?? + delta.sexe = ((DietWD.SexeF - DietLFD.SexeF) - DietWD), #sexe-dep genes?? + levels=design) > > > fit2 <- contrasts.fit(fit, cont.matrix) > fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE) > topTable(fit2) WD_LFD WD.F_LFD.F WD.M_LFD.M delta.sexe AveExpr F P.Value adj.P.Val 22139_at 0.8232779 0.4584550 0.3648229 0.09363214 10.113522 38.30188 2.405464e-08 0.0002966418 17748_at -0.6849334 -0.3371893 -0.3477441 0.01055483 11.319395 30.75738 1.799528e-07 0.0011095887 69036_at -0.6605604 -0.2511031 -0.4094573 0.15835417 12.554694 29.08963 2.943635e-07 0.0012100302 20787_at -0.5784347 -0.3335203 -0.2449145 -0.08860582 9.571703 27.47740 4.828340e-07 0.0014885772 226123_at 0.6573526 0.4589262 0.1984264 0.26049983 7.369280 24.79782 1.150587e-06 0.0028378068 11303_at -0.7741369 -0.2470930 -0.5270440 0.27995100 7.568066 21.72466 3.375199e-06 0.0054942399 18703_at -0.5938379 -0.3379725 -0.2558654 -0.08210715 10.945976 21.69258 3.415043e-06 0.0054942399 68371_at -0.7809301 -0.5201257 -0.2608044 -0.25932122 9.174027 22.62103 3.564217e-06 0.0054942399 240660_at -0.5326507 -0.3816188 -0.1510319 -0.23058683 8.190418 20.47978 5.366370e-06 0.0073531199 237847_at -0.5386111 -0.4031175 -0.1354937 -0.26762380 7.751602 19.84829 6.835333e-06 0.0084293321 > > topTable(fit2,coef=1) logFC AveExpr t P.Value adj.P.Val B 22139_at 0.8232779 10.113522 8.199715 1.473255e-08 0.0001816818 9.153816 17748_at -0.6849334 11.319395 -7.494534 7.470096e-08 0.0003140067 7.766977 69036_at -0.6605604 12.554694 -7.485026 7.638826e-08 0.0003140067 7.747720 20787_at -0.5784347 9.571703 -6.899323 3.091683e-07 0.0009531658 6.533630 11303_at -0.7741369 7.568066 -6.532403 7.583463e-07 0.0018703853 5.746102 226123_at 0.6573526 7.369280 6.261047 1.487074e-06 0.0030564329 5.151217 242109_at 0.7940300 6.865846 6.180800 1.817511e-06 0.0030592229 4.973379 18703_at -0.5938379 10.945976 -6.145715 1.984575e-06 0.0030592229 4.895362 68371_at -0.7809301 9.174027 -6.057329 3.424477e-06 0.0046922949 4.420740 240660_at -0.5326507 8.190418 -5.647087 7.012635e-06 0.0077855083 3.770503 > # got stuck: both topTables for main diet effect do NOT match...  ADD COMMENTlink modified 3.9 years ago by Aaron Lun23k • written 3.9 years ago by Guido Hooiveld2.4k Answer: Design and contrast question limma (additive or nested or duplicateCorrelation() 4 3.9 years ago by Aaron Lun23k Cambridge, United Kingdom Aaron Lun23k wrote: Your diet is nested within the litter effect, so any differential expression between diets will be absorbed into the blocking factor for the litter. This is a good use case for duplicateCorrelation: design <- model.matrix(~0 + SD) colnames(design) <- levels(SD) corfit <- duplicateCorrelation(x.norm, design, block=Litter) fit <- lmFit(x.norm, design, block=Litter, correlation=corfit$consensus)

You can then test between groups to your heart's desire. To test between diets, you can do:

con <- makeContrasts(M.LFD - M.WD, levels=design) # For males
con <- makeContrasts(F.LFD - F.WD, levels=design) # For females
con <- makeContrasts((M.LFD + F.LFD)/2 - (M.WD + F.WD)/2, levels=design) # Average of both

You may or may not want to intersect the first two comparisons to find genes that change consistently between diets for both males and females. For the second question, to test between sexes, you can do:

con <- makeContrasts(M.LFD - F.LFD, levels=design) # For LFD
con <- makeContrasts(M.WD - F.WD, levels=design) # For WD
con <- makeContrasts((M.LFD + M.WD)/2 - (F.LFD + F.WD)/2, levels=design) # Average of both

Finally, to test for a differential response between sexes, you can do:

con <- makeContrasts((M.LFD - M.WD) - (F.LFD - F.WD), levels=design)

As an aside, for your second and third questions, a more elegant model would block on the litter effect directly:

> design2 <- model.matrix(~0 + Litter + SD)
> design2 <- design2[,-7]
> colnames(design2)
[1] "LitterL37" "LitterL39" "LitterL40" "LitterL48" "LitterL49" "LitterL50"
[7] "SDM.LFD"   "SDM.WD"   

This design matrix can be used directly in lmFit without running it through duplicateCorrelations first. The first 6 terms represent the average expression level of the female mice in each litter. The final two terms represent the average log-fold change for the male mice over the females, in the LFD or WD-fed litters. You can then drop either of these terms to get the sex effect for each diet, or you could compare the sex effects between diets for your third question:

con <- makeContrasts(SDM.LFD - SDM.WD, levels=design2)

This approach models the litter effect explicitly, so it avoids some of the assumptions in duplicateCorrelations. That said, the duplicateCorrelations approach that I described above will also work quite well. It's up to you whether you want to make the effort to try out this other model specifically to answer your second and third questions.

1

Hi Aaron,

Thanks very much for your reply! I fully understand your duplicateCorrelation approach. However, I am (still) puzzled by your 2nd approach in which directly is blocked on litter:

Specifically, why do you drop the 7th coefficient (SDF.WD) from the design (and not another one)?

Also, i would appreciate if you could spend few more words on when to drop which term for what contrasts, and how you would define these.

Thanks for answering these (likely naive) questions.

1

I dropped the 7th coefficient because it made the interpretation of the remaining terms easier. If I dropped SDM.WD and retained SDF.WD, the various Litter terms would represent average expression across all females for LFD litters but across males for WD litters. This is unnecessarily confusing, so I dropped SDF.WD to ensure that the Litter terms represent average expression across females for each litter.

As for the contrasts, you can just set coef=7 in topTable to test the sex effect across LFD litters. Similarly, you can set coef=8 to test the sex effect across WD litters. This makes sense, because the corresponding coefficients in design2 represent the sex effect of males over females for each diet.

Hi Aaron,

I have  doubt why dropping coef number 7 the final two terms represent the average log-fold change for male mice over female, in the LFD or WD-fed mice. The design matrix would be:

 SDM.LFD SDM.WD Filename Sexe Diet 0 0 sample13 F LFD 0 0 sample14 F LFD 0 0 sample15 F LFD 0 0 sample16 F LFD 0 0 sample17 F LFD 0 0 sample18 F LFD 0 0 sample19 F WD 0 0 sample20 F WD 0 0 sample21 F WD 0 0 sample22 F WD 0 0 sample23 F WD 0 0 sample24 F WD 1 0 sample1 M LFD 1 0 sample2 M LFD 1 0 sample3 M LFD 1 0 sample4 M LFD 1 0 sample5 M LFD 1 0 sample6 M LFD 0 1 sample7 M WD 0 1 sample8 M WD 0 1 sample9 M WD 0 1 sample10 M WD 0 1 sample11 M WD 0 1 sample12 M WD

Do the last two terms represent SDM.LFD-SDF.LFD and SDM.WD-SDF.LFD? Am I wrong?

Best,

Keifa