Mike Miller
Dear edgeR community,
I am new to edgeR and am still reading the vignette in detail so that I can apply it to my data.
I have a question about understanding the model.matrix output.
On page 27 (section 3.3.2, "Nested interaction formulas"), the design is defined as:
> targets
    Sample   Treat Time
1  Sample1 Placebo   0h
2  Sample2 Placebo   0h
3  Sample3 Placebo   1h
4  Sample4 Placebo   1h
5  Sample5 Placebo   2h
6  Sample6 Placebo   2h
7  Sample1    Drug   0h
8  Sample2    Drug   0h
9  Sample3    Drug   1h
10 Sample4    Drug   1h
11 Sample5    Drug   2h
12 Sample6    Drug   2h
targets$Treat <- relevel(targets$Treat, ref="Placebo")
design <- model.matrix(~Treat + Treat:Time, data=targets)
#and the coefficient names are:
> colnames(design)
[1] "(Intercept)" "TreatDrug"
[3] "TreatPlacebo:Time1h" "TreatDrug:Time1h"
[5] "TreatPlacebo:Time2h" "TreatDrug:Time2h"
Whereas on page 28 (section 3.3.4, "Interaction at any time"), the design formula looks like this:
# I added "2" in "design2", compared to the original text, to make it easier to follow:
> design2 <- model.matrix(~Treat + Time + Treat:Time, data=targets)
> colnames(design2)
[1] "(Intercept)" "TreatDrug"
[3] "Time1h" "Time2h"
[5] "TreatDrug:Time1h" "TreatDrug:Time2h"
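To compare the two parametrizations side by side, here is a minimal base R sketch. It assumes the `targets` table can be rebuilt exactly as printed above (the `data.frame` construction and the names `rank1`, `rank2` and `rank_both` are mine, not from the vignette), and it only uses `model.matrix` and `qr`, so edgeR itself is not needed to run it.

```r
# Rebuild the 12-row targets table from the post (sample names repeat, as printed).
targets <- data.frame(
  Sample = rep(paste0("Sample", 1:6), times = 2),
  Treat  = factor(rep(c("Placebo", "Drug"), each = 6)),
  Time   = factor(rep(rep(c("0h", "1h", "2h"), each = 2), times = 2))
)
targets$Treat <- relevel(targets$Treat, ref = "Placebo")

# Nested formula (section 3.3.2) and full interaction formula (section 3.3.4).
design  <- model.matrix(~ Treat + Treat:Time, data = targets)
design2 <- model.matrix(~ Treat + Time + Treat:Time, data = targets)

print(colnames(design))
print(colnames(design2))

# Rank of each matrix, and of both placed side by side.
rank1     <- qr(design)$rank
rank2     <- qr(design2)$rank
rank_both <- qr(cbind(design, design2))$rank
print(c(rank1, rank2, rank_both))
```

If the three ranks are equal, the two matrices span the same column space, so both formulas describe the same fitted model and differ only in what each coefficient means.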
It is explained for design2 (top of page 29):
"The last two coefficients give the DrugvsPlacebo.1h and DrugvsPlacebo.2h contrasts, so that
> lrt <- glmLRT(fit, coef=5:6)
is useful because it detects genes that respond differently to the drug, relative to the placebo, at either of the times."
My question is, if I have understood this correctly: why are there no coefficients "TreatPlacebo:Time1h" and "TreatPlacebo:Time2h" in design2? And shouldn't "Time1h" and "Time2h" be the effects of time regardless of the treatment, rather than, as the vignette text puts it:
"> lrt <- glmLRT(fit, coef=3)
and
> lrt <- glmLRT(fit, coef=4)
are the effects of the reference drug, i.e., the effects of the placebo at 1 hour and 2 hours"?
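One way I could check this myself is to write each Treat-by-Time group as a combination of the design2 coefficients and subtract. The sketch below is base R only; the `cells` grid and the names `X2`, `placebo_1h` and `drug_1h` are my own and are not from the vignette.

```r
# One row per Treat x Time combination, with Placebo as the reference level.
cells <- expand.grid(
  Time  = factor(c("0h", "1h", "2h")),
  Treat = factor(c("Placebo", "Drug"), levels = c("Placebo", "Drug"))
)
X2 <- model.matrix(~ Treat + Time + Treat:Time, data = cells)
rownames(X2) <- paste(cells$Treat, cells$Time, sep = ".")

# 1h-versus-0h change within each treatment, expressed in design2 coefficients.
placebo_1h <- X2["Placebo.1h", ] - X2["Placebo.0h", ]
drug_1h    <- X2["Drug.1h", ]    - X2["Drug.0h", ]

print(rbind(placebo_1h, drug_1h))
```

If I read the result correctly, the placebo row picks out "Time1h" alone, while the drug row needs "Time1h" plus "TreatDrug:Time1h". That would explain the vignette's wording: once the interaction is in the model, "Time1h" is the time effect at the reference level (placebo) only, and the "TreatDrug:Time" columns carry the extra change seen under the drug, so separate "TreatPlacebo:Time" columns would be redundant.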
Thank you!
------------------------------------
Why I need edgeR: I have an RNA-seq experiment (~30 samples) in which I need to explore the influence of 3 factors with 2 levels each:
1. sex: f/m
2. disease_state: healthy/cancer
3. localization: blood/bones.
Question I want to answer: which genes are differentially expressed between the 2 localizations in the 2 disease states (i.e., are bones more severely affected by cancer than blood), taking sex into account?
I assume that my design formula should look like:
design=~sex+disease+localization+disease:localization
Could anyone please tell me whether the formula is correct? And what should the output be? How would I know whether the disease has different effects depending on the localization? By the number of genes affected (i.e., differentially expressed)?
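To make the question concrete, here is a minimal sketch of the design matrix this formula would produce. The sample sheet is hypothetical (a balanced 2 x 2 x 2 layout with 4 replicates per group, 32 samples, which is not my real data), and the names `samples` and `design3` are made up for illustration.

```r
# Hypothetical, balanced sample sheet: 4 replicates per sex x disease x localization group.
samples <- expand.grid(
  rep          = 1:4,
  sex          = factor(c("f", "m")),
  disease      = factor(c("healthy", "cancer"), levels = c("healthy", "cancer")),
  localization = factor(c("blood", "bones"))
)

design3 <- model.matrix(~ sex + disease + localization + disease:localization,
                        data = samples)
print(colnames(design3))

# The interaction is the last column: the cancer-vs-healthy difference in bones
# minus the same difference in blood, adjusted for sex.
# Assumption: testing only that coefficient, e.g. glmLRT(fit, coef = ncol(design3)),
# would be the per-gene test of whether the disease effect depends on localization.
interaction_col <- colnames(design3)[ncol(design3)]
print(interaction_col)
```

With healthy and blood as reference levels, I would expect the "diseasecancer" column to be the disease effect in blood only, by the same logic as the Time coefficients in the vignette example.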
I would very much appreciate it if someone has time to help me with any of these questions.
Best,
Mike