V. Oostra
Dear list,
I'm analysing my full factorial experiment in R / Maanova and am
having some trouble using "matest" to evaluate which genes show a
significant interaction effect, and in particular how to use the
contrast matrix to compare specific treatment groups with one another.
I hope this is the right place to ask, and I apologise if this has
been asked before.
I'm analysing 46 samples hybridized to custom designed one-colour
Nimblegen arrays. I have two factors: "temperature" (2 levels; 20 or
25) and "age" (3 levels; 20, 50 or 90), in a full factorial design,
with 7 to 8 biological replicates and (at this stage) no random
variables. My response variables are 15,830 probe-summarised, quantile
normalised expression values (log scale). For details and session info
see below the email.
I'm interested in which genes show a significant temperature x age
interaction, and of those genes, which genes show, for each of the
three age levels separately, a significant effect of temperature. Of
the genes that do not show a significant interaction, I want to know
whether the main effects are significant, i.e. which genes show an
overall effect of temperature or age.
This is my model (age is coded as the variable time.point):
> fit1 <- fitmaanova(mydata, formula = ~temperature + time.point + temperature:time.point)
The labels of the treatment groups are as follows:
> fit1[10]
$`temperature:time.point.level`
[1] "20:20" "20:50" "20:90" "25:20" "25:50" "25:90"
Question 1: Am I coding my variables correctly? Or should I make a new
variable 'treatment group' with all 6 combinations of the 2
biological factors temperature and age (as is done in the R / maanova
tutorial)?
Then I want to test the interaction term:
> int<-matest(mydata,fit1,term="temperature:time.point",
test.type = "ftest", test.method=c(1,1,1,1), n.perm=2000)
I know how to access F and p values in int$F1 and int$Fs, but I'm not
sure exactly how to interpret them. The contrast matrix int$Contrast
suggests that a series of pairwise comparisons was performed:
> int$Contrast
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0   -1   -1    0    1
[2,]    0    1   -1    0   -1    1
Question 2: Is this matest call indeed giving me the F and p values for
the interaction as a whole? Or for the two contrasts specified in
int$Contrast? For univariate analyses, I normally use
anova(lm(y ~ factor1 + factor2 + factor1:factor2)) to obtain the F and p
values for the main effects and the interaction in such a two-way
model.
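To make the univariate comparison concrete, here is a minimal base R sketch for one simulated, hypothetical gene (not my data, and not a maanova call). It assumes the six cell means are ordered as in the group labels above, and checks that a joint 2-df test of the two rows shown in int$Contrast gives the same F as the interaction row of anova(lm(...)). Whether matest computes its F statistics in exactly this way is an assumption I have not verified.

```r
# Simulated log-scale values for ONE hypothetical gene (not real data).
# Cell sizes mimic the design: 8/8/8 at temperature 20, 8/7/7 at temperature 25.
set.seed(1)
d <- data.frame(
  temperature = factor(rep(c("20", "25"), times = c(24, 22))),
  time.point  = factor(c(rep(c("20", "50", "90"), times = c(8, 8, 8)),
                         rep(c("20", "50", "90"), times = c(8, 7, 7))))
)
d$y <- rnorm(nrow(d), mean = 8)

# Two-way ANOVA table; the interaction row is a single F test on 2 df.
tab <- anova(lm(y ~ temperature + time.point + temperature:time.point, data = d))
F_int <- tab["temperature:time.point", "F value"]

# Cell-means model with groups ordered 20:20, 20:50, 20:90, 25:20, 25:50, 25:90.
d$group <- interaction(d$temperature, d$time.point, sep = ":", lex.order = TRUE)
cm <- lm(y ~ 0 + group, data = d)

# The two rows printed in int$Contrast, tested jointly as one 2-df hypothesis.
L <- rbind(c(1, 0, -1, -1,  0, 1),
           c(0, 1, -1,  0, -1, 1))
est <- L %*% coef(cm)
V   <- L %*% vcov(cm) %*% t(L)
F_joint <- as.numeric(t(est) %*% solve(V) %*% est) / nrow(L)

print(c(anova_interaction = F_int, joint_contrast = F_joint))
```

In this sketch the two rows are not two separate pairwise comparisons; together they span the 2-df interaction, so the joint test and the ANOVA interaction row agree.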
Ideally, I'd like to use p values for the interaction (with a cut-off
of, say, 0.1) to divide the set of the genes in two: genes with and
genes without a significant interaction effect. For the first group
I'd like to test, for each of the three age levels separately, which
genes show a significant effect of temperature. For the second I'd
like to look at the two main effects.
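As a sketch of the split I have in mind, assuming a named vector of per-gene interaction p-values (here called p_int and filled with random numbers purely for illustration; the gene names are hypothetical too):

```r
# Hypothetical per-gene interaction p-values; random placeholders, not real results.
set.seed(3)
n_genes <- 15830
p_int <- runif(n_genes)
names(p_int) <- paste0("gene", seq_len(n_genes))

cutoff <- 0.1

# Genes with and without a significant interaction at the chosen cut-off.
with_interaction    <- names(p_int)[p_int < cutoff]
without_interaction <- setdiff(names(p_int), with_interaction)

print(c(with = length(with_interaction), without = length(without_interaction)))
```

The first set would then go into the per-age temperature comparisons and the second into the main-effect tests.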
Question 3: Is this a sensible approach? Or should I always include
all genes in the analyses, and only afterwards compare the gene sets of
the different tests? E.g. intersect the set that has a significant
temperature x age interaction with a set that shows a difference
between the 2 temperatures at the first age class? What is the best
way to compare the two temperatures within each of the three age
classes?
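For a single gene, the within-age comparison I am after can be written in base R as contrasts on a cell-means model. The sketch below uses simulated values for one hypothetical gene and plain lm(), not maanova, so it only illustrates the comparison itself. The row names age20, age50 and age90 are labels I made up for the example.

```r
# Simulated log-scale values for ONE hypothetical gene (not real data).
set.seed(2)
d <- data.frame(
  temperature = factor(rep(c("20", "25"), times = c(24, 22))),
  time.point  = factor(c(rep(c("20", "50", "90"), times = c(8, 8, 8)),
                         rep(c("20", "50", "90"), times = c(8, 7, 7))))
)
d$y <- rnorm(nrow(d), mean = 8)

# One coefficient per treatment group, ordered 20:20, 20:50, 20:90, 25:20, 25:50, 25:90.
d$group <- interaction(d$temperature, d$time.point, sep = ":", lex.order = TRUE)
cm <- lm(y ~ 0 + group, data = d)

# One row per age class: temperature 20 minus temperature 25 within that age.
C <- rbind(age20 = c(1, 0, 0, -1,  0,  0),
           age50 = c(0, 1, 0,  0, -1,  0),
           age90 = c(0, 0, 1,  0,  0, -1))

est  <- drop(C %*% coef(cm))                   # estimated differences
se   <- sqrt(diag(C %*% vcov(cm) %*% t(C)))    # their standard errors
tval <- est / se
pval <- 2 * pt(abs(tval), df = df.residual(cm), lower.tail = FALSE)

res <- data.frame(estimate = est, se = se, t = tval, p = pval)
print(res)
```

Each t test here uses the residual variance pooled over all six groups, which is one possible choice; separate per-age fits would be another.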
With the full data set, I continued to analyse, for each of the three
age levels separately, which genes show a significant effect of
temperature. I tried to make a contrast matrix C with one row for each
of the contrasts I care about:
> C
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    0    0    1    0    0   -1
[2,]    0    1    0    0   -1    0
[3,]    1    0    0   -1    0    0
However, when I use matest with this contrast matrix:
> int<-matest(e,fit1,term="temperature:time.point", Contrast = C,
+ test.type = "ttest", test.method=c(1,1,1,1), n.perm=10)
It gives this error:
Error: The number 1 test is not estimable
> traceback()
3: stop(paste("The number", i, "test is not estimable"), call. =
FALSE)
2: checkContrast(model, term, Contrast)
1: matest(e, fit1, term = "temperature:time.point", Contrast = C,
test.type = "ttest", test.method = c(1, 1, 1, 1), n.perm = 10)
I understand that I'm not testing both temperature and age in each
contrast (each row of the matrix), and therefore perhaps should not
use "temperature:time.point" as term in matest. But I'm not sure how
to compare the two temperatures within each of the three age classes.
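One way to see the difference between my rows and the default ones is to lay each contrast row out as a 2 x 3 table (temperature by age) and look at its margins. In an additive model, a contrast is free of both main effects only if every row sum and every column sum of that table is zero. The helper names below are made up for this sketch, and whether checkContrast in maanova applies this criterion is an assumption on my part.

```r
# Reshape a 6-element contrast row into a 2 x 3 layout:
# rows = temperature (20, 25), columns = age (20, 50, 90).
as_layout <- function(row) {
  matrix(row, nrow = 2, byrow = TRUE,
         dimnames = list(temperature = c("20", "25"),
                         age = c("20", "50", "90")))
}

# A row is a pure interaction contrast only if all margins of the layout are zero.
is_interaction_contrast <- function(row) {
  m <- as_layout(row)
  all(rowSums(m) == 0) && all(colSums(m) == 0)
}

# Rows reported in int$Contrast for the default interaction test.
default_rows <- rbind(c(1, 0, -1, -1,  0, 1),
                      c(0, 1, -1,  0, -1, 1))

# Rows of my own matrix C (temperature difference within one age class).
my_rows <- rbind(c(0, 0, 1,  0,  0, -1),
                 c(0, 1, 0,  0, -1,  0),
                 c(1, 0, 0, -1,  0,  0))

default_ok <- apply(default_rows, 1, is_interaction_contrast)
mine_ok    <- apply(my_rows, 1, is_interaction_contrast)

print(default_ok)
print(mine_ok)
print(as_layout(my_rows[1, ]))
```

The default rows pass the check, while each of my rows has temperature margins of 1 and -1, so it mixes the temperature main effect with the interaction.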
Thanks a lot in advance for any input.
Cheers,
Vicencio
> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lattice_0.18-8 maanova_1.20.0 Biobase_2.8.0 limma_3.4.1
loaded via a namespace (and not attached):
[1] grid_2.11.1 tools_2.11.1
> mydata
Summary for this experiment
Number of dyes: 1
Number of arrays: 46
Number of genes: 15830
Number of replicates: 1
Transformation method: None
Replicate collapsed: FALSE
> table(mydata$design[,18:19])
           time.point
temperature 20 50 90
         20  8  8  8
         25  8  7  7
> fit1 <- fitmaanova(mydata, formula = ~temperature + time.point + temperature:time.point)
> int <-matest(mydata,fit1,term="temperature:time.point",
test.type = "ftest", test.method=c(1,1,1,1), n.perm=2000)
> names(fit1)
[1] "probeid" "yhat"
[3] "S2" "G"
[5] "temperature" "temperature.level"
[7] "time.point" "time.point.level"
[9] "temperature:time.point" "temperature:time.point.level"
[11] "model" "subCol"
> fit1[6]
$temperature.level
[1] "20" "25"
> fit1[8]
$time.point.level
[1] "20" "50" "90"
> fit1[10]
$`temperature:time.point.level`
[1] "20:20" "20:50" "20:90" "25:20" "25:50" "25:90"