Question: Random effects and variance components (was: Yet another nested design in limma)

0

Paolo Innocenti •

**320**wrote:Dear Gordon and list,
thanks for the previous help, it was indeed helpful. Nonetheless, even
after some strolling (and independent trials-and-errors), I am still
stuck on this issue:
we came to the conclusion that the simplest good model for our affy
experiment is the following:
design <- model.matrix(~ sex*line, data=pData(data))
sex: M/F,
line: 15 levels (different clones)
8 biological replicates for each line (4 for each sex)
The issue here is that the "line" factor should be treated as random.
First, because each line is a haplotype randomly picked from a large
outbred population. Second, because we would like to estimate, from
the
variance components, the heritability of the transcript (the variance
explained by the "line" term can be approximated to the genetic
variance).
Gordon Smyth wrote:
"Finally, you can add the biolrep as a random effect using the
duplicateCorrelation() function with block argument, as explained in
the
limma User's Guide, but I am not convinced yet that this is absolutely
necessary for your experiment."
I am not sure I understand what you mean here, but the random effect I
want to include is "line", not the biological replicate (that is
numbered 1:4 in each line/sex, but it says nothing about relationships
between each 1, each 2, etc...)
So, the two questions are:
1) How to treat "line" as random. I appreciate that this issue is
explained in chapter 8.2 of the limma user's guide, but I still don't
get how to fit the interaction and how to include my random effect in
my
contrast, or in my toptable. For instance:
design <- model.matrix(~ sex*line, data=pData(data))
randomline <- duplicateCorrelation(eset, design,
block=rep(1:15,each=8))
fit <- lmFit(eset, design, block=rep(1:15, each=8),
cor=randomline$consensus)
I get this error:
Error in chol.default(V) :
the leading minor of order 2 is not positive definite
Probably because the block is exactly the same vector as the line,
already included in the design.
On the other hand, if I fit:
design <- model.matrix(~ sex, data=pData(data))
randomline <- duplicateCorrelation(eset, design,
block=rep(1:15,each=8))
fit <- lmFit(eset, design, block=rep(1:15, each=8),
cor=randomline$consensus)
There is no line:sex interaction, and the only effect I can obtain is
the effect of sex.
How can I get the effect of the line, and of the sex:line interaction?
2) Where can I find the variance components of my random effect?
Thanks in advance for any help,
paolo
>> sessionInfo()
> R version 2.9.0 (2009-04-17)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_
US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC
_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDEN
TIFICATION=C
>
> attached base packages:
> [1] tcltk stats graphics grDevices utils datasets
methods
> [8] base
>
> other attached packages:
> [1] statmod_1.4.0 qvalue_1.18.0 limma_2.18.1 affy_1.22.0
Biobase_2.4.1
> [6] maanova_1.14.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.12.0 preprocessCore_1.6.0 tools_2.9.0
Gordon K Smyth wrote:
> Dear Paolo,
>
> As Naomi Altman as already told you, analysing an experiment such as
> this is straightforward with limma. I guess the problem you are
having
> is that you are trying to use the limma User's Guide's suggestion of
> forming a composite factor out of the individual factors (called the
> group means parametrization), and you don't know how to define
contrasts
> for interactions from this factor. This does become a little more
> involved for experiments with more factors. Can I suggest that you
> instead make use of the factorial formulae in R when you make up the
> design matrix, then you can probably dispense with the contrast step
> altogether.
>
> You could for example use
>
> targets <- read.delim("targets.txt")
> design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)
>
> This will produce a design matrix with the following columns.
>
> > colnames(design)
> [1] "(Intercept)" "Batch" "SexM"
> [4] "PhenH" "PhenL" "PhenA:Line"
> [7] "PhenH:Line" "PhenL:Line" "SexM:PhenH"
> [10] "SexM:PhenL" "SexM:PhenA:Line" "SexM:PhenH:Line"
> [13] "SexM:PhenL:Line"
>
> To find genes significant for the sex x line interaction, you can
simply
> use
>
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef=9:13)
>
> On the other hand,
>
> topTable(fit, coef=9:10)
>
> is the sex x phen interaction.
>
> Finally, you can add the biolrep as a random effect using the
> duplicateCorrelation() function with block argument, as explained in
the
> limma User's Guide, but I am not convinced yet that this is
absolutely
> necessary for your experiment.
>
> Can I also suggest that you stroll over to the mathematics
department at
> Uppsala and talk to someone interested in bioinformatics and
microarray
> analysis, say Professor Tom Britton, and see if you can get ongoing
help
> with statistics and design issues.
>
> Best wishes
> Gordon
>
>
>> Date: Mon, 04 May 2009 14:09:14 +0200
>> From: Paolo Innocenti <paolo.innocenti at="" ebc.uu.se="">
>> Subject: Re: [BioC] Yet another nested design in limma
>> Cc: AAA - Bioconductor <bioconductor at="" stat.math.ethz.ch="">
>>
>> Hi all,
>>
>> since I received a few emails in my mailbox by people interested in
a
>> solution for this design (or a design similar to this one), but
there is
>> apparently no (easy) solution in limma, I was wondering if anyone
could
>> suggest a package for differential expression analysis that allows
>> dealing with:
>>
>> nested designs,
>> random effects,
>> multiple factorial designs with more than 2 levels.
>>
>> I identified siggenes, maanova, factDesign that could fit my needs,
but
>> I would like to have a comment by someone with more experience
before
>> diving into a new package.
>>
>> Best,
>> paolo
>>
>>
>>
>> Paolo Innocenti wrote:
>>> Hi Naomi and list,
>>>
>>> some time ago I asked a question on how to model an experiment in
limma.
>>> I think I need some additional help with it as the experiment grew
in
>>> complexity. I also added a factor "batch" because the arrays were
run in
>>> separate batches, and I think would be good to control for it.
>>> The dataframe with phenotypic informations ("dummy") looks like
this:
>>>
>>> >> Phen Line Sex Batch BiolRep
>>> >> File1 H 1 M 1 1
>>> >> File2 H 1 M 1 2
>>> >> File3 H 1 M 2 3
>>> >> File4 H 1 M 2 4
>>> >> File5 H 1 F 1 1
>>> >> File6 H 1 F 1 2
>>> >> File7 H 1 F 2 3
>>> >> File8 H 1 F 2 4
>>> >> File9 H 2 M 1 1
>>> >> File10 H 2 M 1 2
>>> >> File11 H 2 M 2 3
>>> >> File12 H 2 M 2 4
>>> >> File13 H 2 F 1 1
>>> >> File14 H 2 F 1 2
>>> >> File15 H 2 F 2 3
>>> >> File16 H 2 F 2 4
>>> >> File17 L 3 M 1 1
>>> >> File18 L 3 M 1 2
>>> >> File19 L 3 M 2 3
>>> >> File20 L 3 M 2 4
>>> >> File21 L 3 F 1 1
>>> >> File22 L 3 F 1 2
>>> >> File23 L 3 F 2 3
>>> >> File24 L 3 F 2 4
>>> >> File25 L 4 M 1 1
>>> >> File26 L 4 M 1 2
>>> >> File27 L 4 M 2 3
>>> >> File28 L 4 M 2 4
>>> >> File29 L 4 F 1 1
>>> >> File30 L 4 F 1 2
>>> >> File31 L 4 F 2 3
>>> >> File32 L 4 F 2 4
>>> >> File33 A 5 M 1 1
>>> >> File34 A 5 M 1 2
>>> >> File35 A 5 M 2 3
>>> >> File36 A 5 M 2 4
>>> >> File37 A 5 F 1 1
>>> >> File38 A 5 F 1 2
>>> >> File39 A 5 F 2 3
>>> >> File40 A 5 F 2 4
>>> >> File41 A 6 M 1 1
>>> >> File42 A 6 M 1 2
>>> >> File43 A 6 M 2 3
>>> >> File44 A 6 M 2 4
>>> >> File45 A 6 F 1 1
>>> >> File46 A 6 F 1 2
>>> >> File47 A 6 F 2 3
>>> >> File48 A 6 F 2 4
>>>
>>> In total I have
>>> Factor "Phen", with 3 levels
>>> Factor "Line", nested in Phen, 6 levels
>>> Factor "Sex", 2 levels
>>> Factor "Batch", 2 levels
>>>
>>> I am interested in:
>>>
>>> 1) Effect of sex (M vs F)
>>> 2) Interaction between "Sex" and "Line" (or "Sex" and "Phen")
>>>
>>> Now, I can't really come up with a design matrix (not to mention
the
>>> contrast matrix).
>>>
>>> Naomi Altman wrote:
>>>> You can design this in limma quite readily. Nesting really just
means
>>>> that only a subset of the possible contrasts are of interest.
Just
>>>> create the appropriate contrast matrix and you are all set.
>>>
>>> I am not really sure with what you mean here. Should I treat all
the
>>> factors as in a factorial design?
>>> I might do something like this:
>>>
>>> phen <- factor(dummy$Phen)
>>> line <- factor(dummy$Line)
>>> sex <- factor(dummy$Sex)
>>> batch <- factor(dummy$Batch)
>>> fact <- factor(paste(sex,phen,line,sep="."))
>>> design <- model.matrix(~ 0 + fact + batch)
>>> colnames(design) <- c(levels(fact), "batch2")
>>> fit <- lmFit(dummy.eset,design)
>>> contrast <- makeContrasts(
>>> sex= (F.H.1 + F.H.2 + F.L.3 + F.L.4 + F.A.5 + F.A.6) -
(M.H.1 +
>>> M.H.2 + M.L.3 + M.L.4 + M.A.5 + M.A.6),
>>> levels=design)
>>> fit2 <- contrasts.fit(fit,contrast)
>>> fit2 <- eBayes(fit2)
>>>
>>> In this way I can correctly (I presume) obtain the effect of sex,
but
>>> how can I get the interaction term between sex and line?
>>> I presume there is a "easy" way, but I can't see it...
>>>
>>> Thanks,
>>> paolo
>>>
>>>
>>>>
>>>> --Naomi
>>>>
>>>> At 12:08 PM 2/16/2009, Paolo Innocenti wrote:
>>>>> Hi all,
>>>>>
>>>>> I have an experimental design for a Affy experiment that looks
like
>>>>> this:
>>>>>
>>>>> Phen Line Sex Biol.Rep.
>>>>> File1 H 1 M 1
>>>>> File2 H 1 M 2
>>>>> File3 H 1 F 1
>>>>> File4 H 1 F 2
>>>>> File5 H 2 M 1
>>>>> File6 H 2 M 2
>>>>> File7 H 2 F 1
>>>>> File8 H 2 F 2
>>>>> File9 L 3 M 1
>>>>> File10 L 3 M 2
>>>>> File11 L 3 F 1
>>>>> File12 L 3 F 2
>>>>> File13 L 4 M 1
>>>>> File14 L 4 M 2
>>>>> File15 L 4 F 1
>>>>> File16 L 4 F 2
>>>>>
>>>>>
>>>>> This appears to be a slightly more complicated situation than
the one
>>>>> proposed in the section 8.7 of the limma users guide (p.45) or
by
>>>>> Jenny on this post:
>>>>>
>>>>>
https://stat.ethz.ch/pipermail/bioconductor/2006-February/011965.html
>>>>>
>>>>> In particular, I am intersted in
>>>>> - Effect of "sex" (M vs F)
>>>>> - Interaction between "sex" and "phenotype ("line" nested)
>>>>> - Effect of "phenotype" in males
>>>>> - Effect of "phenotype" in females
>>>>>
>>>>> Line should be nested in phenotype, because they are random
"strains"
>>>>> that happened to end up in phenotype H or L.
>>>>>
>>>>> Can I design this in limma? Is there a source of information
about
>>>>> how to handle with this? In particular, can I design a single
model
>>>>> matrix and then choose the contrasts I am interested in?
>>>>>
>>>>> Any help is much appreciated,
>>>>> paolo
>>>>>
>>>>>
>>>>> --
>>>>> Paolo Innocenti
>>>>> Department of Animal Ecology, EBC
>>>>> Uppsala University
>>>>> Norbyv?gen 18D
>>>>> 75236 Uppsala, Sweden
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>>
http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>> Naomi S. Altman 814-865-3791
(voice)
>>>> Associate Professor
>>>> Dept. of Statistics 814-863-7114
(fax)
>>>> Penn State University 814-865-1348
(Statistics)
>>>> University Park, PA 16802-2111
>>>>
>>>>
>>>
>>
>> --
>> Paolo Innocenti
>> Department of Animal Ecology, EBC
>> Uppsala University
>> Norbyv?gen 18D
>> 75236 Uppsala, Sweden
>
--
Paolo Innocenti
Department of Animal Ecology, EBC
Uppsala University
Norbyv?gen 18D
75236 Uppsala, Sweden

ADD COMMENT
• link
•
modified 10.4 years ago
by
James W. MacDonald ♦

**51k**• written 10.4 years ago by Paolo Innocenti •**320**