Entering edit mode
Dear Tilahun,
There is no way to specify a random block in edgeR, nor in any other
existing statistical software designed to analyse RNA-Seq counts.
However there is no purpose in doing so for your experiment. The only
reason that you might want to define block as random rather than fixed
is
to recover inter-block information about the experimental treatments.
Your experiment is entirely balanced, in that all treatment conditions
appear in each block, so there is no inter-block information to
recover.
Hence the edgeR analysis that I have suggested, which uses intra-block
information only, is fully efficient.
If you want to test for a treatment effect in each individual tissue:
design <- model.matrix(~Tissue+Tissue:Treatment+Block)
Then
lrtA <- glmLRT(d, fit, coef="TissueA:Treatmentstressed")
finds genes DE between stress and control in tissue A,
lrtB <- glmLRT(d, fit, coef="TissueB:Treatmentstressed")
finds genes DE between stress and control in tissue B. And so on.
You will find it useful to examine colnames(design).
Best wishes
Gordon
> Date: Tue, 31 Jan 2012 11:24:49 -0600
> From: Tilahun Abebe <tilahun.abebe at="" uni.edu="">
> To: bioconductor at r-project.org
> Subject: Re: [BioC] Bioconductor Digest, Vol 107, Issue 31
>
> Dear Gordon,
>
> Thank you for the tip. I think I am making some progress. I should
have
> phrased my last question better. I meant what code do I use to get
genes
> differentially expressed in one tissue under control and stress
conditions.
>
> I have one more question. Is it possible to specify the random
effect in
> the design matrix in edgeR? I am thinking of something similar to
the
> RANDOM statement in SAS that will allow me to treat "block" as a
random
> effect?
>
> Cheers.
>
> Tilahun
> ----------------------------
>
>> Date: Tue, 31 Jan 2012 11:18:59 +1100 (AUS Eastern Daylight Time)
>> From: Gordon K Smyth <smyth at="" wehi.edu.au="">
>> To: Tilahun Abebe <tilahun.abebe at="" uni.edu="">
>> Cc: Bioconductor mailing list <bioconductor at="" r-project.org="">
>> Subject: [BioC] Please help! How to specify factors for a RCBD in
>> edgeR
>>
>> Dear Tilahun,
>>
>> The first step is that you need to create a data frame containing
the
>> experimental factors, just as you would for a SAS analysis. So you
need
>> to create three factors:
>>
>> Tissue
>> Treatment
>> Block
>>
>> each containing 24 values, one for each RNA sample. Then the
design marix
>> is formed by:
>>
>> design <- model.matrix(~Tissue*Treatment+Block)
>>
>> Type colnames(design) to see how the coefficients are defined. You
will
>> see that the interaction coefficients are coefficients 6 to 8.
>>
>> After fitting your linear model, you could find genes that show
>> significant Tissue*Treatment interaction (on 3df) by
>>
>> lrt <- glmLRT(d, fit, coef=6:8)
>>
>> and so on.
>>
>> I don't understand your question "How do I obtain differentially
expressed
>> genes in each Tissue*Stress combination?", so I can't give specific
advice
>> on that. Differentially expressed compared to what?
>>
>> Best wishes
>> Gordon
>>
>> ---------------- original message -------------------
>> [BioC] Please help! How to specify factors for a RCBD in edgeR
>> Tilahun Abebe tilahun.abebe at uni.edu
>> Wed Jan 25 22:16:41 CET 2012
>>
>> Hi,
>>
>> I am learning how to use edgeR to analyze RNA-seq data generated
from
>> Illumina GAII. The experimental design is fairly complex.
>>
>> I have a mixed 4x2 factorial randomized complete block design
(RCBD)
>> consisting of:
>>
>> 4 tissues: A, B, C, D
>> 2 treatments: control, stressed
>> 3 blocks: Block1, Block2, Block3
>>
>> Tissue and treatment are fixed effects and block is a random
effect.
>>
>> Here is the code I tried to use in edgeR:
>>
>>> counts <- read.delim( file = "Mycounts.txt", header = TRUE)
>>> rownames <-counts [ , 1 ]
>>> d <- counts [, 2:25] # counts are in columns 2-25
>>> d
>>> group <- c(rep("Control", 3), rep("Stress", 3), rep("Control", 3),
>> rep("Stress", 3), rep("Control", 3), rep("Stress", 3),
rep("Control", 3),
>> rep("Stress", 3))
>>> d <- DGEList(counts = d, group = group)
>>> design <- model.matrix(~group)
>>> d <- calcNormFactors(d)
>>> d$samples
>>> d <- estimateGLMCommonDisp(d, design)
>>> d <- estimateGLMTagwiseDisp(d, design)
>>> d$common.dispersion
>>> fit <- glmFit(d, design)
>>> lrt <- glmLRT(d, fit, coef=2)
>>> topTags(lrt, n=4)
>>
>> I am interested to know genes differentially expressed in each of
the four
>> tissues under stress. However, I feel like I am not specifying the
factors
>> correctly in the design statement. My questions are:
>>
>> 1) How do I specify the fixed effects Tissue, Stress, and
Tissue*Stress
>> interaction in the model?
>> 2) How do I tell edgeR to use block as a random effect?
>> 3) How do I obtain differentially expressed genes in each
Tissue*Stress
>> combination?
>>
>> I appreciate your help.
>>
>> Cheers.
>>
>> Tilahun Abebe, Ph.D.
>> University of northern Iowa
>> Cedar Falls, IA
______________________________________________________________________
The information in this email is confidential and
intend...{{dropped:4}}