Question: Multi-level experiment with limma
gravatar for luca.piacentini
2.3 years ago by
luca.piacentini0 wrote:

Dear limma team,

although similar posts, I would need a further understanding of the “Multi-level Experiments” setting with limma.  My experiment design includes paired samples and two factors (Condition and Tissue) similarly to the example of the section 9.7 of the limma User Guide (Last revised 16 October 2016) but with the Condition variable that is unbalanced (25% High and 75% Low; see below).

FileName            Subject                 Condition            Tissue

File01                    1                             High                       A

File02                    1                             High                       B

File03                    2                             High                       A

File04                    2                             High                       B

File05                    3                             High                       A

File06                    3                             High                       B

File07                    4                             Low                       A

File08                    4                             Low                       B

File09                    5                             Low                       A

File10                    5                             Low                       B

File23                    12                           Low                       A

File24                    12                           Low                       B


Basically, I followed the example code of the User Guide and it seems to work properly. However, I didn’t understand exactly how the duplicateCorrelation function may take into account for a paired design with two factors. Please, can you give me, in short, the correct interpretation of the duplicateCorrelation function and how it works?

Moreover, I would enquire whether there is some trick to use the variable “Condition” as a continuous variable in place of the dichotomous (High/Low) variable.

Thanks for your help.


ADD COMMENTlink modified 2.3 years ago by Aaron Lun25k • written 2.3 years ago by luca.piacentini0
Answer: Multi-level experiment with limma
gravatar for Aaron Lun
2.3 years ago by
Aaron Lun25k
Cambridge, United Kingdom
Aaron Lun25k wrote:

Presumably, your design matrix only contains the factors Condition and Tissue. By itself, this would fail to account for the fact that your samples are paired by subject. The subsequent linear model would exhibit correlations in the residuals of paired samples, due to the presence of subject-specific effects. For example, if something happened to subject 1 that resulted in lower expression of a particular gene, the residuals would be negative for both File01 and File02. These correlations are bad because they interfere with statistical inference from the linear model. In particular, limma will think that there is more information in the experiment than there actually is, because it will assume that all observations are independent. In practice, this usually results in anticonservativeness as the uncertainty in the coefficient estimates is understated.

This is where duplicateCorrelation comes into play. It estimates the correlation for all genes due to this subject-specific effect; roughly speaking, this is the proportion of the variance that is explained by putting Subject into the model. Estimation of the correlation is done with some care to share information across genes to yield a consensus estimate, which is more stable (and useful) than a gene-specific estimate. lmFit will then use the consensus correlation in generalized least squares, which fits the linear model in a manner that accounts for the Subject blocking factor and the associated correlation.

Finally, if you want to use Condition as a continuous variable - just do it, model.matrix will happily accept real-valued covariates.

ADD COMMENTlink modified 2.3 years ago • written 2.3 years ago by Aaron Lun25k

Dear Aaron,

thanks for your quick reply! I understood a little bit better the meaning underlying the duplicateCorrelation function.

For what concern my second issue, actually, I tried to use Condition as a continuous variable in place of High/Low factorial variable. Whereas I easily drew a contrast matrix for the first design (i.e. Condition and Tissue as factors) as follows:

Treat<-factor(paste(mydata$Condition, mydata$tissue, sep="."))

design<- model.matrix(~0+Treat)

# colnames(design) are: High.TissueA, High.TissueB, Low.TissueA, Low.TissueB

cm<- makeContrasts(

TissueA_High.vs.TissueA_Low=High.TissueA-Low.TissueA, # which genes respond to Condition in TissueA

TissueB_High.vs.TissueB_Low=High.TissueB-Low.TissueB, # which genes respond to Condition in TissueB

TissueA.vs.TissueB_High=High.TissueA-High.TissueB, # which genes respond to Tissue in High

TissueA.vs.TissueB_Low=Low.TissueA-Low.TissueB, # which genes respond to Tissue in Low

(TissueA.vs.TissueB_High)_VS_(TissueA.vs.TissueB_Low)=(High.TissueA-High.TissueB)-(Low.TissueA-Low.TissueB), # which genes respond differently in High vs Low (interaction)



I had some doubt about how to correctly draw and interpret the contrast matrix when I substituted the Condition as factor with the Condition as continuous values. In other words, how can I contrast an interaction term to identify genes that respond differently to Condition (continuous) in TissueA vs TissueB? 

Thanks again!


ADD REPLYlink written 2.3 years ago by luca.piacentini0

You need to do something like this:

Tissue <- rep(LETTERS[1:2], each=5)
Condition <- runif(10)
design <- model.matrix(~0 + Tissue + Tissue:Condition)

This effectively gives an intercept and a gradient term for each level of Tissue, where the gradient describes the change in the log-expression values with respect to increasing values of Condition. To determine whether the condition effect is the same in each tissue, you can just test for equality of the gradient terms via makeContrasts.

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by Aaron Lun25k

There are multiple ways to effectively model the matrix, this is great.

Thanks again Aaron. 

ADD REPLYlink written 2.3 years ago by luca.piacentini0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 198 users visited in the last hour