Question: limma, reference group, covariate
1
gravatar for alakatos
4.6 years ago by
alakatos100
United States
alakatos100 wrote:

Hello, 

I would like to use “limma” for my Affy gene expression analysis.  My dataset is in “expressionset”  format. 

I would like  to include two variables in my design matrix: 4-level treatment and 2-level sex. The goal is to test the treatment groups against “WT" while controlling for sex ; (treatment effect independent from sex). 

Dataset:

> class(ExpressionSet)
[1] "ExpressionSet"
attr(,"package")
[1] “Biobase" 

Includes expression data, phenoData and annotation. 

Execution:

pd <- pData(phenoData(ExpressionSet)) 
design <- model.matrix(~0+Treatment+Sex, pd, ref="WT")
colnames(design) <- gsub("Treatment|Sex", "", colnames(design))
fit <- lmFit(exprs(ExpressionSet), design)
ebFit <- eBayes(fit)

Design 
                    S_S   S_Veh   WT_S            WT            M
1112F-1         0       1          0                   0               0
1112F-2          0       1         0                  0                1
1112F-3          0       1         0                  0                1
1112F-4          0       1         0                  0                0
1112F-5          0       0         1                  0                 0 
Etc..

[1] 1 1 1 1 2
attr(,"contrasts")
attr(,"contrasts”)$Treatment
[1] "contr.treatment"
attr(,"contrasts")$Sex
[1] "contr.treatment"

It seems  “WT” is not recognized as reference. My understanding is that  I should not see “WT" in the design matrix if it is a reference.  

1. Would you please advise how to fix the reference?

2. I am also wondering if my setup  is correct for testing treatment and controlling for sex.  

3. What do I test without  the reference group in the design above? What is the interpretation of for instance, the result of  coef=1  (tt <- topTable(ebFit, number=10, coef=1, adjust = "fdr”) when  no (ref= ) in the design. 

Thanks a lot for your help and time.

Anita

ADD COMMENTlink modified 4.6 years ago • written 4.6 years ago by alakatos100
Answer: limma, reference group, covariate
3
gravatar for James W. MacDonald
4.6 years ago by
United States
James W. MacDonald50k wrote:

The model you have fit is 'cell means' model, and it is estimating the means for each treatment group. This is a perfectly acceptable parameterization, but you have an extra step of fitting contrasts.
 

contrasts <- makeContrasts(S_S-WT, S_Veh-WT, WT_S-WT, levels = design)

fit2 <- contrasts.fit(fit, contrasts)

ebFit <- eBayes(fit2)

And at this point

topTable(ebFit, coef=1)

will output the top 10 genes for the S_S vs WT contrast.

You could always do what I think you want by excluding the '0' from your call to model.matrix (see ?formula for more information). But do note that there is no argument 'ref' for model.matrix (or any underlying functions), so that will have no effect on what the reference level is.

To answer your other questions, yes your model is correct for testing treatment and controlling for sex. But do note that this might not be a biologically reasonable thing to do. By fitting this model you are in essence assuming that the differences between sexes is a simple shift in expression levels. In other words, you assume that for a given gene (for example) that if the S_S treatment in females has a higher expression than WT, that this same pattern holds for males, but with perhaps higher or lower overall expression levels. This will miss (or more correctly ignore) any situations where e.g., S_S treatment is up-regulated as compared to WT in females, but down-regulated in males (or any other dis-similar reaction to S_S treatment that is based on sex). To capture that sort of thing you need an interaction term.

So one could argue that you should at least do (here I just fake up the Treatment and Sex factors as an example):

> Treatment <- factor(rep(c("S_S","S_Veh","WT_S", "WT"), each = 4))
> Sex <- factor(rep(c("Male","Female"), each=2, times=4))
> Treatment <- relevel(Treatment, ref = "WT")
> design <- model.matrix(~Treatment*Sex)
> colnames(design)
[1] "(Intercept)"            "TreatmentS_S"           "TreatmentS_Veh"        
[4] "TreatmentWT_S"          "SexMale"                "TreatmentS_S:SexMale"  
[7] "TreatmentS_Veh:SexMale" "TreatmentWT_S:SexMale"

And then you could do

fit <- lmFit(ExpressionSet, design)
fit2 <- eBayes(fit)
topTable(fit2, 6:8)

Which will then tell you if you have any genes for which the interaction is significant for any of the treatments. If there are none, you are probably safe ignoring the interaction and proceeding with what you were doing initially.

ADD COMMENTlink written 4.6 years ago by James W. MacDonald50k
Answer: limma, reference group, covariate
1
gravatar for Gordon Smyth
4.6 years ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

You appear to have confused the functions model.matrix() and modelMatrix(). The latter has a 'ref' argument and is only for two-color data. The former does not have a 'ref' argument and is appropriate for single channel data, as you have here.

It is a good idea to get familiar with help system, which will tell you what the arguments are for each function. Try typing ?model.matrix at the R prompt.

ADD COMMENTlink modified 4.6 years ago • written 4.6 years ago by Gordon Smyth37k
Answer: limma, reference group, covariate
0
gravatar for alakatos
4.6 years ago by
alakatos100
United States
alakatos100 wrote:

Hi James,

Thank you very much for the detailed answer and clarification. 

All the best, 

Anita 

 

ADD COMMENTlink written 4.6 years ago by alakatos100
Answer: limma, reference group, covariate
0
gravatar for alakatos
4.6 years ago by
alakatos100
United States
alakatos100 wrote:

Ok. I will.  Thank you.

 

 

ADD COMMENTlink written 4.6 years ago by alakatos100
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 322 users visited in the last hour