Question: limma random effects and multi level experiments
0
5 weeks ago by
David Rengel70
European Union
David Rengel70 wrote:

Hi,

I have a question regarding limma-driven microarray analysis.

I have a total of 24 arrays. My experimental design consists in two fixed factors, i.e. "matrix" (F or G) and "dim" (2 or 3), and what I consider a Random Factor, i.e. "donor". There are six donors, so that each donor has provided a matrix * dim treatment. That is: 4 possible combinations matrix * dim * 6 donors = 24 arrays.

I run the following code:

    eset <-new ("ExpressionSet", exprs = as.matrix(data.norm))
treat <- factor (substr(colnames(data.norm),1,2))
design <- model.matrix (~ 0 + treat)
colnames(design) <- c("F2","F3","G2","G3")
corfit <- duplicateCorrelation (eset, design, block = factor(substr(colnames(data.norm),4,6)))
corfit$consensus fit <- lmFit(eset, design, block = factor (substr(colnames(data.norm),4,6)), correlation = corfit$consensus)

cont.matrix <- makeContrasts(
F3.F2 = F3 - F2,
G3.G2 = G3 - G2,
G2.F2 = G2 - F2,
G3.F3 = G3 - F3,
inter = (G3 - G2) - (F3 - F2),
levels = design
)

fit2 <- contrasts.fit (fit, cont.matrix)
eb <- eBayes(fit2)


My main question is: is it right to run such code? I say this because on page 50 of the limma user's guide, it says regarding multi level designs that "Since we need to make comparaisons BOTH within AND between subjects, it is necessary to treat Patient as a Random Effect" (capital letters are mine). This is because in the example given in the document, patients were either "Normal" or "Diseased", but not both. In my design, however, a single donor covers all fixed effect combinations. I still consider my donors as random, but I wonder if the limma code I used is adapted.

A subsidiary question would deal with the fact that when running lmer models on every gene some of them present singular fits, that is over fitting. This is not visible when running the code shown above. I am more used to run lme or lmer models out of the transcriptomics studies, where thousands of models, i.e. one per gene, are fitted.

Best, David

modified 27 days ago • written 5 weeks ago by David Rengel70

Hi David,

To me, it seems like you were using sample ID (i.e. column names) instead of the donor ID for the blocking variable in the duplicateCorrelation. Perhaps I am wrong. Other than that, I do not see any problem with your code.

Set up a metadata table. Otherwise we're just guessing what your code actually does.

Answer: limma random effects and multi level experiments
0
4 weeks ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

The whole purpose of treating a blocking variable as "random" in a linear model is to recover inter-block information about the treatments. For a balanced design like yours, in which every donor receives every treatment, there is no inter-block information to recover because all the blocks receive the same treatments.

Hence, for your design, there is no point is running duplicateCorrelation, or lmer for that matter. Just include donor as an additive factor in the design matrix. Then the treatments will be compared within donors. Simple and robust.

The situation is very much analogous to paired t-tests, just with four treatments per donor instead of two.

Hi Gordon, Thanks again for your suggestion. I have tried to implement it but I came across with this issue: the "reference" level of my treatment disappears under “donor + treatment” design. It obviously stays if I implement a “treatment + donor” design. I mean "reference" level because I do not want to make any reference level explicit. This is why I had chosen a design matrix with no intercept. Please find here below my code:

eset <- new ("ExpressionSet", exprs = as.matrix(data.norm))


I may choose one of these three designs:

design_intercept <-  model.matrix (~ design.exp$donor + design.exp$treat)
design_dt <- model.matrix (~ 0 + design.exp$donor + design.exp$treat)
design_td <- model.matrix (~ 0 + design.exp$treat + design.exp$donor)


If I chose "design_dt", I'd continue as follows:

colnames(design_dt) <- sub("design.exp.", "", colnames(design_dt))

fit <- lmFit(eset, design)

cont.matrix <- makeContrasts(
F3.F2 = treatF3 - treatF2,
G3.G2 = treatG3 - treatG2,
G2.F2 = treatG2 - treatF2,
G3.F3 = treatG3 - treatF3,
inter = (treatG3 - treatG2) - (treatF3 - treatF2),
levels = design_dt)


But treatF2 does not exist as such anymore, and I do not wish F2 (nor any other level) to be the reference. Any hint on this would be much appreciated. Best,

David

...Continued from previous comment (above)

In others termes: it seems that by doing donor + treat, there are too many parameters to be estimated and one of them is left out. I hav obviously found myself in this scenario before but I do not recall it in a situation where I need to build contrasts without a reference. Is there any way around it? Thanks again. David

1

Using a treatment contrasts parameterization you will always have a baseline in this model. Generally you would do

> df <- data.frame(treat = rep(c("F2","F3","G2","G3"), 6), donor = gl(4,6))
> df
treat donor
1     F2     1
2     F3     1
3     G2     1
4     G3     1
5     F2     1
6     F3     1
7     G2     2
8     G3     2
9     F2     2
10    F3     2
11    G2     2
12    G3     2
13    F2     3
14    F3     3
15    G2     3
16    G3     3
17    F2     3
18    F3     3
19    G2     4
20    G3     4
21    F2     4
22    F3     4
23    G2     4
24    G3     4
> model.matrix(~0+treat+donor, df)
treatF2 treatF3 treatG2 treatG3 donor2 donor3 donor4
1        1       0       0       0      0      0      0
2        0       1       0       0      0      0      0
3        0       0       1       0      0      0      0
4        0       0       0       1      0      0      0
5        1       0       0       0      0      0      0
6        0       1       0       0      0      0      0
7        0       0       1       0      1      0      0
8        0       0       0       1      1      0      0
9        1       0       0       0      1      0      0
10       0       1       0       0      1      0      0
11       0       0       1       0      1      0      0
12       0       0       0       1      1      0      0
13       1       0       0       0      0      1      0
14       0       1       0       0      0      1      0
15       0       0       1       0      0      1      0
16       0       0       0       1      0      1      0
17       1       0       0       0      0      1      0
18       0       1       0       0      0      1      0
19       0       0       1       0      0      0      1
20       0       0       0       1      0      0      1
21       1       0       0       0      0      0      1
22       0       1       0       0      0      0      1
23       0       0       1       0      0      0      1
24       0       0       0       1      0      0      1


In which case the baseline is donor 1, and you can make comparisons between treatments to your heart's content.

Thanks a lot James. That was very helpful. Including the gl function, of which I was unaware! Best. David

Answer: limma random effects and multi level experiments
0
27 days ago by
David Rengel70
European Union
David Rengel70 wrote:

Thanks Mikhael, Aaron and Gordon for your kind replies. I apoligize for my late own reply. I will proceed as Gordon suggests. Best, David.