Search
Question: question about the EdgeR package: additive models and blocking
0
gravatar for Fix Ace
2.6 years ago by
Fix Ace100
United States
Fix Ace100 wrote:

1. In the manual, section 3.5, comparison both between and within subjects
If I have the similar design, but my question is only to find genes responding to the hormone, may I skip the patient item in design matrix:

Like instead of using:

design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)

only use: 

design <- model.matrix(~Disease+Disease:Treatment) 

And a relevant question:

2. if I have a design matrix: 

design <- model.matrix(~0+Disease+Patient+Treatment)

are the coefficients still equivalent to the expression level of each group? I feel some levels in Patient or Treatment are missing

Thanks a lot:)

ADD COMMENTlink modified 2.6 years ago by Gordon Smyth32k • written 2.6 years ago by Fix Ace100
1
gravatar for Gordon Smyth
2.6 years ago by
Gordon Smyth32k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth32k wrote:

1. No you can't remove Patient from the model, and why would you want to? The design in Section 3.5 of the edgeR User's Guide is a paired experiment with the two treatments given to each patient. The whole point of designing a paired experiment is so that you can gain statistical power by adjusting for baseline differences between the patients. Removing Patient from the model throws that all away.

2. No, the coefficients don't estimate expression in each group, because you don't have distinct groups. The only context when you should use "0+" in the model formula is for the simple one-way layout situation described in Section 3.2.3 of the edgeR User's Guide. In almost all other situations you should not use "0+".

The default parametrization of linear models in R is to compare each level of a factor back to the first level of that factor. You won't see a coefficient for the first level of each factor because you can't compare the first level to itself. This is a frequently asked question; see for example: limma modeling, paired samples: first disease type disappears from design matrix

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by Gordon Smyth32k
0
gravatar for Fix Ace
2.6 years ago by
Fix Ace100
United States
Fix Ace100 wrote:

Thank you very much for the reply.

My next question is: if I use

> design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
> colnames(design)
 [1] "(Intercept)" "DiseaseDisease1"
 [3] "DiseaseDisease2" "DiseaseHealthy:Patient2"
 [5] "DiseaseDisease1:Patient2" "DiseaseDisease2:Patient2"
 [7] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
 [9] "DiseaseDisease2:Patient3" "DiseaseHealthy:TreatmentHormone"
[11] "DiseaseDisease1:TreatmentHormone" "DiseaseDisease2:TreatmentHormone"

I run glmLRT, how do I interpret the coefficients other than 10:12

for example:

fit <- glmFit(y, design)

Then

lrt <- glmLRT(fit, coef=2:3)

or 

lrt<-glmLRT(fit, coef=4:6)

May I say that coef=2:3, is to find the genes whose changes are resulting from the diseases?

and how about coef=4:6?

Many thanks again!

ADD COMMENTlink modified 2.6 years ago by Gordon Smyth32k • written 2.6 years ago by Fix Ace100
0
gravatar for Gordon Smyth
2.6 years ago by
Gordon Smyth32k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth32k wrote:

"May I say that coef=2:3, is to find the genes whose changes are resulting from the diseases? and how about coef=4:6?"

No, you can't. None of the coefficients other than 10:12 have any useful meaning, because they are basically simply a way to represent the baseline levels of the patients.

The short answer to all your questions so far is that you cannot easily deviate from the analysis described in Section 3.5 of the edgeR User's Guide.

Basically, the requirement that you need to adjust for baseline differences between the patients when testing for hormone differences makes it impossible to test for baseline differences groups of patients, for example to test disease vs healthy, using this type of analysis.

If you really do want to find DE genes for diseased patients vs healthy patients without hormone or diseased vs healthy with hormone, then there are two choices: (i) using edgeR, you can subset your data and do separate analyses of the without hormone and the with hormone samples, or (ii) switch to limma-voom, which has the capability of doing multilevel analyses of experiments like this.

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by Gordon Smyth32k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 269 users visited in the last hour