Question: Limma --- linear model type I and type III SS problem
3.3 years ago by
Belgium
ashley.lu010 wrote:

Dear all,

I would like to use the limma package to perform differential expression analysis on our RNA-seq data, because of the flexibility of linear models.

However, as I increase the complexity of the model or of the question, I am having trouble interpreting our model.

We have 3 categorical factors and 1 continuous covariate:

Age : Young and Old

Genotype: Transgenic and Wildtype

Strain: Strain A and Strain B

Transcription factor expression (continuous).

We are interested in

1. What is the age effect on RNA expression?

2. What is the age and genotype interaction effect on RNA expression?

3. Is there a difference between Strain A and Strain B in the age*genotype effect?

4. Does the expression level of the transcription factor have an effect on RNA expression?

For the first 3 questions, I fitted a 3 way factorial model:

Age*Genotype*Strain

For the 4th question, I fitted an ANCOVA-type model:

Age+genotype+Age:genotype+Transcription_Factor_Expression

1. I noticed that when fitting the 3-way factorial model, the topTable outputs differ when I change the order of the factors. Does this mean limma uses Type I (sequential) sums of squares, so that the interpretation of the model depends on which factor is entered first as a main effect?
2. When I add the transcription factor expression to the model, the order of the terms no longer seems to matter. Does this mean limma switches to Type III sums of squares when a continuous covariate is fitted?

In addition, the transcription factor expression is probably not independent of the age*genotype effect. If I fit the model Age+genotype+Age:genotype+Transcription_Factor_Expression, can I interpret the Transcription_Factor_Expression coefficient as the effect explained on top of the age*genotype effect, rather than as an effect separate from the age*genotype effect?

Thank you,

Ashley

modified 3.3 years ago by Gordon Smyth37k • written 3.3 years ago by ashley.lu010
Answer: Limma --- linear model type I and type III SS problem
3.3 years ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

Changing the order of the factors will not change the fitted model, but it will change the order of the model coefficients. Thus, you have to be careful about which comparison you are actually making when you drop coefficients in topTable. It's hard to say without seeing some code, but I'll guess that you're dropping the same coefficient (in terms of column number) between models; this may result in a different comparison when you change the order of factors.

For your second question, when you add the TF expression covariate, you're probably adding it as the last term in the model, and you're probably testing for a TF effect by dropping the last term. This means that the order of the first three factors doesn't matter, as the last term in the model is always the TF expression covariate.
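To make the column-number point concrete, here is a minimal sketch in plain Python (not limma itself). The data, the 2x2 Age by Genotype layout, and the column names such as `AgeOld` are made up for illustration. It fits the same model with the columns in two different orders and then looks up a coefficient by position and by name.

```python
# Illustration only: made-up data, ordinary least squares in pure Python.

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(n):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients, returned in the same order as `columns`."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)

# Hypothetical log-expression values for one gene, two replicates per cell.
age = [0, 0, 0, 0, 1, 1, 1, 1]    # 1 = Old
geno = [0, 0, 1, 1, 0, 0, 1, 1]   # 1 = Transgenic
y = [5.1, 4.9, 6.2, 5.8, 7.0, 7.4, 6.1, 6.3]

design = {
    "Intercept": [1] * 8,
    "AgeOld": age,
    "GenotypeTg": geno,
    "AgeOld:GenotypeTg": [a * g for a, g in zip(age, geno)],
}

order_a = ["Intercept", "AgeOld", "GenotypeTg", "AgeOld:GenotypeTg"]
order_b = ["Intercept", "GenotypeTg", "AgeOld", "AgeOld:GenotypeTg"]

beta_a = ols([design[n] for n in order_a], y)
beta_b = ols([design[n] for n in order_b], y)
by_name_a = dict(zip(order_a, beta_a))
by_name_b = dict(zip(order_b, beta_b))

# Same coefficient when selected by name, different one when selected by position.
print("AgeOld by name:   ", by_name_a["AgeOld"], by_name_b["AgeOld"])
print("second coefficient:", beta_a[1], beta_b[1])
```

In limma the analogous safeguard is to pass a coefficient name rather than a column number to the `coef` argument of `topTable`, so the comparison does not silently change when the formula is reordered.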

Finally, it's not clear what your distinction is between "on top of" and "separated from". The value of the coefficient represents the increase in log-expression per unit of TF expression - this definition is applicable across all samples, so in that sense, it's separate from the age/genotype level of each sample. However, the TF expression term is an additive factor, so its effect gets added "on top of" the age/genotype effects for each sample.
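The additive interpretation can be sketched with a few lines of Python. The coefficient values below are hypothetical, chosen only for illustration, and the model is assumed to be the additive one in the question (no interaction between TF expression and age or genotype). The predicted change per unit of TF expression is then the same in every age/genotype group, while the group effects shift the baseline.

```python
# Hypothetical coefficients on the log-expression scale (illustrative values only).
coef = {
    "Intercept": 5.0,
    "AgeOld": 2.2,
    "GenotypeTg": 1.0,
    "AgeOld:GenotypeTg": -2.0,
    "TF": 0.5,
}

def predict(old, tg, tf):
    """Predicted log-expression for a sample with the given age, genotype and TF level."""
    return (coef["Intercept"]
            + coef["AgeOld"] * old
            + coef["GenotypeTg"] * tg
            + coef["AgeOld:GenotypeTg"] * old * tg
            + coef["TF"] * tf)

# Change in predicted log-expression for a one-unit increase in TF, per group.
deltas = {}
for old in (0, 1):
    for tg in (0, 1):
        deltas[(old, tg)] = predict(old, tg, 3.0) - predict(old, tg, 2.0)
        print("old =", old, "tg =", tg,
              "baseline at TF=2:", predict(old, tg, 2.0),
              "change per unit TF:", deltas[(old, tg)])
```

If the TF slope were expected to differ between groups, the model would need an interaction term involving the TF covariate, which the model in the question does not include.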

Edit: Some errors in my original answer were pointed out by Gordon; these have been fixed up.

modified 3.3 years ago • written 3.3 years ago by Aaron Lun24k
Answer: Limma --- linear model type I and type III SS problem
3.3 years ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

Actually you are mistaken. For the models you have fitted, the order in which the factors appear in the model never matters, regardless of whether you include the continuous covariate or not. The limma results remain the same whatever order you give the factors in.

limma always tests each factor or covariate adjusted for all other terms in the model. This is essentially equivalent to SAS Type III. So, yes, you are testing for the effect of the covariate over and above that of the other factors. You are testing whether the TF expression explains something beyond what is already explained by age and genotype.
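The "adjusted for all other terms" idea can be checked numerically. The sketch below uses plain Python with made-up data and hypothetical column names, and it computes ordinary (unmoderated) t- and F-statistics, so it only approximates what limma reports after empirical Bayes moderation. It shows that the t-statistic for the TF covariate is the same wherever the TF column sits in the design, that its square equals the F-statistic for dropping TF from the full model, and that a sequential (Type I style) sum of squares for TF entered first is a different quantity.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(n):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(columns, y):
    """Return (coefficients, residual sum of squares, X'X) for the given design columns."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, columns)) for i in range(len(y))]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, rss, xtx

# Made-up data for one gene: 2x2 Age by Genotype layout plus a TF covariate.
age = [0, 0, 0, 0, 1, 1, 1, 1]
geno = [0, 0, 1, 1, 0, 0, 1, 1]
tf = [1.0, 1.4, 2.1, 1.9, 2.8, 3.1, 2.2, 2.6]
y = [5.1, 4.9, 6.2, 5.8, 7.0, 7.4, 6.1, 6.3]
cols = {
    "Intercept": [1.0] * 8,
    "AgeOld": age,
    "GenotypeTg": geno,
    "AgeOld:GenotypeTg": [a * g for a, g in zip(age, geno)],
    "TF": tf,
}

def adjusted_test(order, term):
    """Ordinary t for `term` and F for dropping `term`, both adjusted for all other terms."""
    beta, rss_full, xtx = fit([cols[n] for n in order], y)
    s2 = rss_full / (len(y) - len(order))          # residual variance
    j = order.index(term)
    unit = [1.0 if k == j else 0.0 for k in range(len(order))]
    t = beta[j] / math.sqrt(s2 * solve(xtx, unit)[j])
    _, rss_reduced, _ = fit([cols[n] for n in order if n != term], y)
    partial_ss = rss_reduced - rss_full            # SS for term given all others
    return t, partial_ss / s2, partial_ss

tf_last = ["Intercept", "AgeOld", "GenotypeTg", "AgeOld:GenotypeTg", "TF"]
tf_first = ["Intercept", "TF", "AgeOld", "GenotypeTg", "AgeOld:GenotypeTg"]
t_last, f_last, partial_ss = adjusted_test(tf_last, "TF")
t_first, f_first, _ = adjusted_test(tf_first, "TF")

# Sequential SS for TF when it is entered directly after the intercept.
_, rss_intercept, _ = fit([cols["Intercept"]], y)
_, rss_intercept_tf, _ = fit([cols["Intercept"], cols["TF"]], y)
sequential_ss = rss_intercept - rss_intercept_tf

print("t for TF (TF last, TF first):", t_last, t_first)
print("t squared vs F for dropping TF:", t_last ** 2, f_last)
print("partial SS vs sequential SS:", partial_ss, sequential_ss)
```

Because TF expression is correlated with the age/genotype groups in this toy data, the sequential sum of squares is much larger than the adjusted one; the adjusted version is the one that corresponds to testing a single coefficient in the full model.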

modified 3.3 years ago • written 3.3 years ago by Gordon Smyth37k