edgeR factorial experiment contrasts

0

Entering edit mode

Guest User ★ 13k

@guest-user-4897

Last seen 9.6 years ago

Hi, I am trying to figure out how to do comparisons using edgeR but have to admit that I am on unsure ice. My experiment is abiotic stress on two genotypes (E and J) and these have been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). I would like to analyze this experiment as a factorial (which I am sure could be argued), and when I set up the following design: > design<-model.matrix(~0 + Group + Time + Group*Time), I get the following design matrix (I've taken out the 3 three replicates at experimental cell.): > design GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 E00 1 0 0 0 0 0 0 0 0 0 J00 0 1 0 0 0 0 0 0 0 0 E01 1 0 1 0 0 0 0 0 0 0 J01 0 1 1 0 0 0 1 0 0 0 E05 1 0 0 1 0 0 0 0 0 0 J05 0 1 0 1 0 0 0 1 0 0 E48 1 0 0 0 1 0 0 0 0 0 J48 0 1 0 0 1 0 0 0 1 0 E96 1 0 0 0 0 1 0 0 0 0 J96 0 1 0 0 0 1 0 0 0 1 attr(,"assign") [1] 1 1 2 2 2 2 3 3 3 3 attr(,"contrasts") attr(,"contrasts")$Group [1] "contr.treatment" attr(,"contrasts")$Time [1] "contr.treatment" I have some questions regarding testing contrasts: First though, I realize that I can pull out the data for E00 and J00, say, and compare these using exactTest, but then I guess that you are disregarding all information that lies in the remaining data set? That does not seem to me to be a good way to do it. But I may be wrong here? Second, if I want to do the comparison E00 vs J00 using the design indicated above, how should that be set up? Would lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I set up that contrast? For my own pedagogic reasons I can show how I was thinking in this case: E00 = L1 E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 E05 = L1 + L4 versus --> 2L3 + 2L4 - 3L5 -3L6 E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 E96 = L1 + L6 resulting in the contrast: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Then I seem to remember that contrasts should sum up to zero? If anybody could tell me where I am loosing it, I would be very obliged. Thank you jd -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian (Bokm??l)_Norway.1252 [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 loaded via a namespace (and not attached): [1] tools_3.1.0 > -- Sent via the guest posting facility at bioconductor.org.

edgeR • 1.6k views

ADD COMMENT • link updated 9.7 years ago by James W. MacDonald 65k • written 9.7 years ago by Guest User ★ 13k

0

Entering edit mode

"On thin ice", would be the more idiomatic expression.

ADD REPLY • link 9.4 years ago Aaron Lun ★ 28k

0

Entering edit mode

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 1 day ago

United States

Hi Jahn, Things will go much easier for you if you just parameterize as cell means rather than a conventional factorial model. grps <- factor(paste(Group, Time, sep = "_")) design <- model.matrix (~0+grps) Then all the comparisons are easy to do. Best, Jim On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] <guest at="" bioconductor.org=""> wrote: > Hi, > I am trying to figure out how to do comparisons using edgeR but have to > admit that I am on unsure ice. > My experiment is abiotic stress on two genotypes (E and J) and these have > been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). > I would like to analyze this experiment as a factorial (which I am sure > could be argued), and when I set up the following design: > > > design<-model.matrix(~0 + Group + Time + Group*Time), > > I get the following design matrix (I've taken out the 3 three replicates > at experimental cell.): > > > design > GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 > GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 > E00 1 0 0 0 0 0 0 0 > 0 0 > J00 0 1 0 0 0 0 0 0 > 0 0 > E01 1 0 1 0 0 0 0 0 > 0 0 > J01 0 1 1 0 0 0 1 0 > 0 0 > E05 1 0 0 1 0 0 0 0 > 0 0 > J05 0 1 0 1 0 0 0 1 > 0 0 > E48 1 0 0 0 1 0 0 0 > 0 0 > J48 0 1 0 0 1 0 0 0 > 1 0 > E96 1 0 0 0 0 1 0 0 > 0 0 > J96 0 1 0 0 0 1 0 0 > 0 1 > attr(,"assign") > [1] 1 1 2 2 2 2 3 3 3 3 > attr(,"contrasts") > attr(,"contrasts")$Group > [1] "contr.treatment" > > attr(,"contrasts")$Time > [1] "contr.treatment" > > > I have some questions regarding testing contrasts: > First though, I realize that I can pull out the data for E00 and J00, say, > and compare these using exactTest, but then I guess that you are > disregarding all information that lies in the remaining data set? That does > not seem to me to be a good way to do it. But I may be wrong here? > > Second, if I want to do the comparison E00 vs J00 using the design > indicated above, how should that be set up? Would > > lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? > > Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I > set up that contrast? > For my own pedagogic reasons I can show how I was thinking in this case: > > E00 = L1 > E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 > E05 = L1 + L4 > > versus --> 2L3 + 2L4 - 3L5 -3L6 > > E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 > E96 = L1 + L6 > > > resulting in the contrast: > > lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) > > Then I seem to remember that contrasts should sum up to zero? If anybody > could tell me where I am loosing it, I would be very obliged. > > Thank you > > jd > > > > > > -- output of sessionInfo(): > > R version 3.1.0 (2014-04-10) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian > (Bokm??l)_Norway.1252 > [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C > [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 > > attached base packages: > [1] splines stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 > > loaded via a namespace (and not attached): > [1] tools_3.1.0 > > > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD COMMENT • link 9.7 years ago James W. MacDonald 65k

0

Entering edit mode

Hi Jim, Yes, I believe I can see that (except for possibly my third question). It's just that I tend to think along these lines. Are they wrong? jahn ________________________________ Fra: James W. MacDonald [jmacdon at uw.edu] Sendt: 3. september 2014 22:06 Til: Jahn Davik [guest] Kopi: bioconductor at r-project.org; Jahn Davik Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, Things will go much easier for you if you just parameterize as cell means rather than a conventional factorial model. grps <- factor(paste(Group, Time, sep = "_")) design <- model.matrix (~0+grps) Then all the comparisons are easy to do. Best, Jim On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] <guest at="" bioconductor.org<mailto:guest="" at="" bioconductor.org="">> wrote: Hi, I am trying to figure out how to do comparisons using edgeR but have to admit that I am on unsure ice. My experiment is abiotic stress on two genotypes (E and J) and these have been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). I would like to analyze this experiment as a factorial (which I am sure could be argued), and when I set up the following design: > design<-model.matrix(~0 + Group + Time + Group*Time), I get the following design matrix (I've taken out the 3 three replicates at experimental cell.): > design GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 E00 1 0 0 0 0 0 0 0 0 0 J00 0 1 0 0 0 0 0 0 0 0 E01 1 0 1 0 0 0 0 0 0 0 J01 0 1 1 0 0 0 1 0 0 0 E05 1 0 0 1 0 0 0 0 0 0 J05 0 1 0 1 0 0 0 1 0 0 E48 1 0 0 0 1 0 0 0 0 0 J48 0 1 0 0 1 0 0 0 1 0 E96 1 0 0 0 0 1 0 0 0 0 J96 0 1 0 0 0 1 0 0 0 1 attr(,"assign") [1] 1 1 2 2 2 2 3 3 3 3 attr(,"contrasts") attr(,"contrasts")$Group [1] "contr.treatment" attr(,"contrasts")$Time [1] "contr.treatment" I have some questions regarding testing contrasts: First though, I realize that I can pull out the data for E00 and J00, say, and compare these using exactTest, but then I guess that you are disregarding all information that lies in the remaining data set? That does not seem to me to be a good way to do it. But I may be wrong here? Second, if I want to do the comparison E00 vs J00 using the design indicated above, how should that be set up? Would lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I set up that contrast? For my own pedagogic reasons I can show how I was thinking in this case: E00 = L1 E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 E05 = L1 + L4 versus --> 2L3 + 2L4 - 3L5 -3L6 E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 E96 = L1 + L6 resulting in the contrast: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Then I seem to remember that contrasts should sum up to zero? If anybody could tell me where I am loosing it, I would be very obliged. Thank you jd -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Norwegian (Bokm?l)_Norway.1252 LC_CTYPE=Norwegian (Bokm?l)_Norway.1252 [3] LC_MONETARY=Norwegian (Bokm?l)_Norway.1252 LC_NUMERIC=C [5] LC_TIME=Norwegian (Bokm?l)_Norway.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 loaded via a namespace (and not attached): [1] tools_3.1.0 > -- Sent via the guest posting facility at bioconductor.org<http: bioconductor.org="">. _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org<mailto:bioconductor at="" r-project.org=""> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.7 years ago Jahn Davik ▴ 110

0

Entering edit mode

Hi Jahn, No, it's not wrong. You can parameterize any number of ways, and you will get identical results for equivalent contrasts. But given that you will get identical results, and one way is conceptually easier to figure out, I would in general advocate using the conceptually easier parameterization over a more difficult one that requires lots of algebraic contortions in order to figure out what contrast to use. I believe you sent a truncated version of your experiment (there were no replicates at each time point), so I can't give you explicit code. However, say you have a design matrix that looks something like this: E00 J00 E01 J01 E05 J05 E48 J48 E96 J96 <zeros and="" ones="" go="" here=""> And you want to compare the third contrast you mention below. It's as simple as glmLRT(fit, contrast = c(0.33, 0, 0.33, 0,0.33, 0,-0.5, 0, -0.5, 0)) Which tests the average of the E0-5 time points against the average of the E48-96 time points. And interactions are simple as well. Say you want the interaction between times 0 and 96: glmLRT(fit, contrast = c(1,-1,0,0,0,0,0,0,-1,1)) Simple, no? Best, Jim On Wed, Sep 3, 2014 at 4:10 PM, Jahn Davik <jahn.davik at="" bioforsk.no=""> wrote: > Hi Jim, > Yes, I believe I can see that (except for possibly my third question). > It's just that I tend to think along these lines. Are they wrong? > > jahn > > ------------------------------ > *Fra:* James W. MacDonald [jmacdon at uw.edu] > *Sendt:* 3. september 2014 22:06 > *Til:* Jahn Davik [guest] > *Kopi:* bioconductor at r-project.org; Jahn Davik > *Emne:* Re: [BioC] edgeR factorial experiment contrasts > > Hi Jahn, > > Things will go much easier for you if you just parameterize as cell > means rather than a conventional factorial model. > > grps <- factor(paste(Group, Time, sep = "_")) > > design <- model.matrix (~0+grps) > > Then all the comparisons are easy to do. > > Best, > > Jim > > > > On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] < > guest at bioconductor.org> wrote: > >> Hi, >> I am trying to figure out how to do comparisons using edgeR but have to >> admit that I am on unsure ice. >> My experiment is abiotic stress on two genotypes (E and J) and these have >> been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). >> I would like to analyze this experiment as a factorial (which I am sure >> could be argued), and when I set up the following design: >> >> > design<-model.matrix(~0 + Group + Time + Group*Time), >> >> I get the following design matrix (I've taken out the 3 three replicates >> at experimental cell.): >> >> > design >> GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 >> GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 >> E00 1 0 0 0 0 0 0 >> 0 0 0 >> J00 0 1 0 0 0 0 0 >> 0 0 0 >> E01 1 0 1 0 0 0 0 >> 0 0 0 >> J01 0 1 1 0 0 0 1 >> 0 0 0 >> E05 1 0 0 1 0 0 0 >> 0 0 0 >> J05 0 1 0 1 0 0 0 >> 1 0 0 >> E48 1 0 0 0 1 0 0 >> 0 0 0 >> J48 0 1 0 0 1 0 0 >> 0 1 0 >> E96 1 0 0 0 0 1 0 >> 0 0 0 >> J96 0 1 0 0 0 1 0 >> 0 0 1 >> attr(,"assign") >> [1] 1 1 2 2 2 2 3 3 3 3 >> attr(,"contrasts") >> attr(,"contrasts")$Group >> [1] "contr.treatment" >> >> attr(,"contrasts")$Time >> [1] "contr.treatment" >> >> >> I have some questions regarding testing contrasts: >> First though, I realize that I can pull out the data for E00 and J00, >> say, and compare these using exactTest, but then I guess that you are >> disregarding all information that lies in the remaining data set? That does >> not seem to me to be a good way to do it. But I may be wrong here? >> >> Second, if I want to do the comparison E00 vs J00 using the design >> indicated above, how should that be set up? Would >> >> lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? >> >> Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I >> set up that contrast? >> For my own pedagogic reasons I can show how I was thinking in this case: >> >> E00 = L1 >> E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 >> E05 = L1 + L4 >> >> versus --> 2L3 + 2L4 - 3L5 -3L6 >> >> E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 >> E96 = L1 + L6 >> >> >> resulting in the contrast: >> >> lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) >> >> Then I seem to remember that contrasts should sum up to zero? If anybody >> could tell me where I am loosing it, I would be very obliged. >> >> Thank you >> >> jd >> >> >> >> >> >> -- output of sessionInfo(): >> >> R version 3.1.0 (2014-04-10) >> Platform: x86_64-w64-mingw32/x64 (64-bit) >> >> locale: >> [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian >> (Bokm??l)_Norway.1252 >> [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C >> [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 >> >> attached base packages: >> [1] splines stats graphics grDevices utils datasets methods >> base >> >> other attached packages: >> [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 >> >> loaded via a namespace (and not attached): >> [1] tools_3.1.0 >> > >> >> -- >> Sent via the guest posting facility at bioconductor.org. >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.7 years ago James W. MacDonald 65k

0

Entering edit mode

Hi Jim, Thanks for clarification. Appreciate it. For the sake of it I ran a couple of comparisons between the one-way approach that you advocate and the factorial approach. For the simple comparisons, e.g., E00 vs J00, the two approaches returned pretty much the same results. When I did the comparison ?average(E00+E01+E05)? vs ?average(E48+E96)? the topTags were pretty much equal. Still, a big fraction (~ 15%) of the significant (FDR<.001) transcripts found in one of the groups, were not found in the other. I suspect that can be explained by including the interactions in the factorial approach and not taking them into account in the one-way approach? Another thing about doing the factorial approach is that you end up with a contrast statement something like: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) (the design matrix is slightly different from the one you commented upon below) where the coefficients certainly don?t add up to zero. This comes from the ?algebraic contortions? as you call them. I got the impression from your correspondence with Julia Pickl on this site that contrast coefficients should sum up to zero? But that may only apply to her situation? Finally, I would be pleased to hear thoughts on which approach is statistically ?best?. I?ve been under the impression in statistically analyses that using the entire data set from an experiment is good practice, even if one makes simple pairwise comparison. Does edgeR do that? Thank you. Best, jahn Fra: James W. MacDonald [mailto:jmacdon at uw.edu] Sendt: 3. september 2014 23:38 Til: Jahn Davik Kopi: Jahn Davik [guest]; bioconductor at r-project.org Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, No, it's not wrong. You can parameterize any number of ways, and you will get identical results for equivalent contrasts. But given that you will get identical results, and one way is conceptually easier to figure out, I would in general advocate using the conceptually easier parameterization over a more difficult one that requires lots of algebraic contortions in order to figure out what contrast to use. I believe you sent a truncated version of your experiment (there were no replicates at each time point), so I can't give you explicit code. However, say you have a design matrix that looks something like this: E00 J00 E01 J01 E05 J05 E48 J48 E96 J96 <zeros and="" ones="" go="" here=""> And you want to compare the third contrast you mention below. It's as simple as glmLRT(fit, contrast = c(0.33, 0, 0.33, 0,0.33, 0,-0.5, 0, -0.5, 0)) Which tests the average of the E0-5 time points against the average of the E48-96 time points. And interactions are simple as well. Say you want the interaction between times 0 and 96: glmLRT(fit, contrast = c(1,-1,0,0,0,0,0,0,-1,1)) Simple, no? Best, Jim On Wed, Sep 3, 2014 at 4:10 PM, Jahn Davik <jahn.davik at="" bioforsk.no<mailto:jahn.davik="" at="" bioforsk.no="">> wrote: Hi Jim, Yes, I believe I can see that (except for possibly my third question). It's just that I tend to think along these lines. Are they wrong? jahn ________________________________ Fra: James W. MacDonald [jmacdon at uw.edu<mailto:jmacdon at="" uw.edu="">] Sendt: 3. september 2014 22:06 Til: Jahn Davik [guest] Kopi: bioconductor at r-project.org<mailto:bioconductor at="" r-project.org="">; Jahn Davik Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, Things will go much easier for you if you just parameterize as cell means rather than a conventional factorial model. grps <- factor(paste(Group, Time, sep = "_")) design <- model.matrix (~0+grps) Then all the comparisons are easy to do. Best, Jim On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] <guest at="" bioconductor.org<mailto:guest="" at="" bioconductor.org="">> wrote: Hi, I am trying to figure out how to do comparisons using edgeR but have to admit that I am on unsure ice. My experiment is abiotic stress on two genotypes (E and J) and these have been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). I would like to analyze this experiment as a factorial (which I am sure could be argued), and when I set up the following design: > design<-model.matrix(~0 + Group + Time + Group*Time), I get the following design matrix (I've taken out the 3 three replicates at experimental cell.): > design GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 E00 1 0 0 0 0 0 0 0 0 0 J00 0 1 0 0 0 0 0 0 0 0 E01 1 0 1 0 0 0 0 0 0 0 J01 0 1 1 0 0 0 1 0 0 0 E05 1 0 0 1 0 0 0 0 0 0 J05 0 1 0 1 0 0 0 1 0 0 E48 1 0 0 0 1 0 0 0 0 0 J48 0 1 0 0 1 0 0 0 1 0 E96 1 0 0 0 0 1 0 0 0 0 J96 0 1 0 0 0 1 0 0 0 1 attr(,"assign") [1] 1 1 2 2 2 2 3 3 3 3 attr(,"contrasts") attr(,"contrasts")$Group [1] "contr.treatment" attr(,"contrasts")$Time [1] "contr.treatment" I have some questions regarding testing contrasts: First though, I realize that I can pull out the data for E00 and J00, say, and compare these using exactTest, but then I guess that you are disregarding all information that lies in the remaining data set? That does not seem to me to be a good way to do it. But I may be wrong here? Second, if I want to do the comparison E00 vs J00 using the design indicated above, how should that be set up? Would lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I set up that contrast? For my own pedagogic reasons I can show how I was thinking in this case: E00 = L1 E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 E05 = L1 + L4 versus --> 2L3 + 2L4 - 3L5 -3L6 E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 E96 = L1 + L6 resulting in the contrast: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Then I seem to remember that contrasts should sum up to zero? If anybody could tell me where I am loosing it, I would be very obliged. Thank you jd -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian (Bokm??l)_Norway.1252 [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 loaded via a namespace (and not attached): [1] tools_3.1.0 > -- Sent via the guest posting facility at bioconductor.org<http: bioconductor.org="">. _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org<mailto:bioconductor at="" r-project.org=""> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.7 years ago Jahn Davik ▴ 110

0

Entering edit mode

Hi Jahn, On Thu, Sep 4, 2014 at 5:16 AM, Jahn Davik <jahn.davik at="" bioforsk.no=""> wrote: > Hi Jim, > > Thanks for clarification. Appreciate it. > > > > For the sake of it I ran a couple of comparisons between the one-way > approach that you advocate and the factorial approach. For the simple > comparisons, e.g., E00 vs J00, the two approaches returned pretty much the > same results. > This comparison is not simple when you fit a factorial model. If you used the contrast you mentioned earlier (something like c(-1,1,0,0,0,...)), then you aren't comparing E00 vs J00. Instead you are comparing the main effects for E and J, which is not the same as comparing E00 and J00 (which is a direct comparison of the average expression for genes at time zero). The main effect for E and J is a comparison of the overall means for the two groups, rather than the means at time zero. When I did the comparison ?average(E00+E01+E05)? vs ?average(E48+E96)? the > topTags were pretty much equal. Still, a big fraction (~ 15%) of the > significant (FDR<.001) transcripts found in one of the groups, were not > found in the other. I suspect that can be explained by including the > interactions in the factorial approach and not taking them into account in > the one-way approach? > No. Like I said before, if you compute equivalent contrasts, you will get identical results regardless of the underlying parameterization. If you are getting differences it is because the contrasts you are computing for the two parameterizations are not equivalent (e.g., you aren't making the same comparisons). This has nothing to do with the presence or absence of interaction terms. > > > Another thing about doing the factorial approach is that you end up with a > contrast statement something like: > > > > lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) > Without seeing the design matrix and taking the time to interpret the coefficients, I cannot say for sure if this is correct or not. I doubt it is correct, but cannot say for sure. As you note, I did tell Julia Pickl that coefficients for a contrast usually sum to zero. However, if the coefficients being compared are already comparisons themselves, you have to interpret the contrast in light of the underlying coefficients. But this brings us back to the main point I have been trying to make all along. You certainly can use a factorial parameterization for this analysis, but there is no logical reason for doing so. The factorial parameterization is decomposing your data into coefficients that are not really useful in this context, and then you have to do all this work to figure out the interpretation for each coefficient in order to then figure out what contrast you should be using. The alternative is to parameterize your model in a very simple way that you can then easily construct contrasts to make any comparisons you might care to make. Why you want to do things the hard way instead of the easy way eludes me. But don't take my word for it; this is something that Gordon Smyth has been talking about both on this list and in his user's guides for years now. As an example, see section 9.5.4 in the limma User's Guide, where he shows a couple of factorial parameterizations and the required contrasts. And he ends with this sentence: "Since the approaches are equivalent, users are free to choose whichever one is most intuitive or convenient." Best, Jim > > (the design matrix is slightly different from the one you commented upon > below) where the coefficients certainly don?t add up to zero. This comes > from the ?algebraic contortions? as you call them. I got the impression > from your correspondence with Julia Pickl on this site that contrast > coefficients should sum up to zero? But that may only apply to her > situation? > > > > Finally, I would be pleased to hear thoughts on which approach is > statistically ?best?. I?ve been under the impression in statistically > analyses that using the entire data set from an experiment is good > practice, even if one makes simple pairwise comparison. Does edgeR do that? > > > > Thank you. > > > > Best, > > jahn > > > > > > *Fra:* James W. MacDonald [mailto:jmacdon at uw.edu] > *Sendt:* 3. september 2014 23:38 > *Til:* Jahn Davik > *Kopi:* Jahn Davik [guest]; bioconductor at r-project.org > > *Emne:* Re: [BioC] edgeR factorial experiment contrasts > > > > Hi Jahn, > > > > No, it's not wrong. You can parameterize any number of ways, and you will > get identical results for equivalent contrasts. But given that you will get > identical results, and one way is conceptually easier to figure out, I > would in general advocate using the conceptually easier parameterization > over a more difficult one that requires lots of algebraic contortions in > order to figure out what contrast to use. > > > > I believe you sent a truncated version of your experiment (there were no > replicates at each time point), so I can't give you explicit code. However, > say you have a design matrix that looks something like this: > > > > E00 J00 E01 J01 E05 J05 E48 J48 E96 J96 > > <zeros and="" ones="" go="" here=""> > > > > And you want to compare the third contrast you mention below. It's as > simple as > > > > glmLRT(fit, contrast = c(0.33, 0, 0.33, 0,0.33, 0,-0.5, 0, -0.5, 0)) > > > > Which tests the average of the E0-5 time points against the average of the > E48-96 time points. > > > > And interactions are simple as well. Say you want the interaction between > times 0 and 96: > > > > glmLRT(fit, contrast = c(1,-1,0,0,0,0,0,0,-1,1)) > > > > Simple, no? > > > > Best, > > > > Jim > > > > > > > > On Wed, Sep 3, 2014 at 4:10 PM, Jahn Davik <jahn.davik at="" bioforsk.no=""> wrote: > > Hi Jim, > Yes, I believe I can see that (except for possibly my third question). > It's just that I tend to think along these lines. Are they wrong? > > jahn > ------------------------------ > > *Fra:* James W. MacDonald [jmacdon at uw.edu] > *Sendt:* 3. september 2014 22:06 > *Til:* Jahn Davik [guest] > *Kopi:* bioconductor at r-project.org; Jahn Davik > *Emne:* Re: [BioC] edgeR factorial experiment contrasts > > Hi Jahn, > > > > Things will go much easier for you if you just parameterize as cell means > rather than a conventional factorial model. > > > > grps <- factor(paste(Group, Time, sep = "_")) > > > > design <- model.matrix (~0+grps) > > > > Then all the comparisons are easy to do. > > > > Best, > > > > Jim > > > > > > > > On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] < > guest at bioconductor.org> wrote: > > Hi, > I am trying to figure out how to do comparisons using edgeR but have to > admit that I am on unsure ice. > My experiment is abiotic stress on two genotypes (E and J) and these have > been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). > I would like to analyze this experiment as a factorial (which I am sure > could be argued), and when I set up the following design: > > > design<-model.matrix(~0 + Group + Time + Group*Time), > > I get the following design matrix (I've taken out the 3 three replicates > at experimental cell.): > > > design > GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 > GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 > E00 1 0 0 0 0 0 0 0 > 0 0 > J00 0 1 0 0 0 0 0 0 > 0 0 > E01 1 0 1 0 0 0 0 0 > 0 0 > J01 0 1 1 0 0 0 1 0 > 0 0 > E05 1 0 0 1 0 0 0 0 > 0 0 > J05 0 1 0 1 0 0 0 1 > 0 0 > E48 1 0 0 0 1 0 0 0 > 0 0 > J48 0 1 0 0 1 0 0 0 > 1 0 > E96 1 0 0 0 0 1 0 0 > 0 0 > J96 0 1 0 0 0 1 0 0 > 0 1 > attr(,"assign") > [1] 1 1 2 2 2 2 3 3 3 3 > attr(,"contrasts") > attr(,"contrasts")$Group > [1] "contr.treatment" > > attr(,"contrasts")$Time > [1] "contr.treatment" > > > I have some questions regarding testing contrasts: > First though, I realize that I can pull out the data for E00 and J00, say, > and compare these using exactTest, but then I guess that you are > disregarding all information that lies in the remaining data set? That does > not seem to me to be a good way to do it. But I may be wrong here? > > Second, if I want to do the comparison E00 vs J00 using the design > indicated above, how should that be set up? Would > > lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? > > Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I > set up that contrast? > For my own pedagogic reasons I can show how I was thinking in this case: > > E00 = L1 > E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 > E05 = L1 + L4 > > versus --> 2L3 + 2L4 - 3L5 -3L6 > > E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 > E96 = L1 + L6 > > > resulting in the contrast: > > lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) > > Then I seem to remember that contrasts should sum up to zero? If anybody > could tell me where I am loosing it, I would be very obliged. > > Thank you > > jd > > > > > > -- output of sessionInfo(): > > R version 3.1.0 (2014-04-10) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian > (Bokm??l)_Norway.1252 > [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C > [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 > > attached base packages: > [1] splines stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 > > loaded via a namespace (and not attached): > [1] tools_3.1.0 > > > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > > -- > > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > > > > > -- > > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.7 years ago James W. MacDonald 65k

0

Entering edit mode

Hi Jim, Thank you for the thorough feedback. Appreciate it. Best jahn Fra: James W. MacDonald [mailto:jmacdon at uw.edu] Sendt: 4. september 2014 16:40 Til: Jahn Davik Kopi: Jahn Davik [guest]; bioconductor at r-project.org Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, On Thu, Sep 4, 2014 at 5:16 AM, Jahn Davik <jahn.davik at="" bioforsk.no<mailto:jahn.davik="" at="" bioforsk.no="">> wrote: Hi Jim, Thanks for clarification. Appreciate it. For the sake of it I ran a couple of comparisons between the one-way approach that you advocate and the factorial approach. For the simple comparisons, e.g., E00 vs J00, the two approaches returned pretty much the same results. This comparison is not simple when you fit a factorial model. If you used the contrast you mentioned earlier (something like c(-1,1,0,0,0,...)), then you aren't comparing E00 vs J00. Instead you are comparing the main effects for E and J, which is not the same as comparing E00 and J00 (which is a direct comparison of the average expression for genes at time zero). The main effect for E and J is a comparison of the overall means for the two groups, rather than the means at time zero. When I did the comparison ?average(E00+E01+E05)? vs ?average(E48+E96)? the topTags were pretty much equal. Still, a big fraction (~ 15%) of the significant (FDR<.001) transcripts found in one of the groups, were not found in the other. I suspect that can be explained by including the interactions in the factorial approach and not taking them into account in the one-way approach? No. Like I said before, if you compute equivalent contrasts, you will get identical results regardless of the underlying parameterization. If you are getting differences it is because the contrasts you are computing for the two parameterizations are not equivalent (e.g., you aren't making the same comparisons). This has nothing to do with the presence or absence of interaction terms. Another thing about doing the factorial approach is that you end up with a contrast statement something like: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Without seeing the design matrix and taking the time to interpret the coefficients, I cannot say for sure if this is correct or not. I doubt it is correct, but cannot say for sure. As you note, I did tell Julia Pickl that coefficients for a contrast usually sum to zero. However, if the coefficients being compared are already comparisons themselves, you have to interpret the contrast in light of the underlying coefficients. But this brings us back to the main point I have been trying to make all along. You certainly can use a factorial parameterization for this analysis, but there is no logical reason for doing so. The factorial parameterization is decomposing your data into coefficients that are not really useful in this context, and then you have to do all this work to figure out the interpretation for each coefficient in order to then figure out what contrast you should be using. The alternative is to parameterize your model in a very simple way that you can then easily construct contrasts to make any comparisons you might care to make. Why you want to do things the hard way instead of the easy way eludes me. But don't take my word for it; this is something that Gordon Smyth has been talking about both on this list and in his user's guides for years now. As an example, see section 9.5.4 in the limma User's Guide, where he shows a couple of factorial parameterizations and the required contrasts. And he ends with this sentence: "Since the approaches are equivalent, users are free to choose whichever one is most intuitive or convenient." Best, Jim (the design matrix is slightly different from the one you commented upon below) where the coefficients certainly don?t add up to zero. This comes from the ?algebraic contortions? as you call them. I got the impression from your correspondence with Julia Pickl on this site that contrast coefficients should sum up to zero? But that may only apply to her situation? Finally, I would be pleased to hear thoughts on which approach is statistically ?best?. I?ve been under the impression in statistically analyses that using the entire data set from an experiment is good practice, even if one makes simple pairwise comparison. Does edgeR do that? Thank you. Best, jahn Fra: James W. MacDonald [mailto:jmacdon at uw.edu<mailto:jmacdon at="" uw.edu="">] Sendt: 3. september 2014 23:38 Til: Jahn Davik Kopi: Jahn Davik [guest]; bioconductor at r-project.org<mailto:ioconductor at="" r-project.org=""> Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, No, it's not wrong. You can parameterize any number of ways, and you will get identical results for equivalent contrasts. But given that you will get identical results, and one way is conceptually easier to figure out, I would in general advocate using the conceptually easier parameterization over a more difficult one that requires lots of algebraic contortions in order to figure out what contrast to use. I believe you sent a truncated version of your experiment (there were no replicates at each time point), so I can't give you explicit code. However, say you have a design matrix that looks something like this: E00 J00 E01 J01 E05 J05 E48 J48 E96 J96 <zeros and="" ones="" go="" here=""> And you want to compare the third contrast you mention below. It's as simple as glmLRT(fit, contrast = c(0.33, 0, 0.33, 0,0.33, 0,-0.5, 0, -0.5, 0)) Which tests the average of the E0-5 time points against the average of the E48-96 time points. And interactions are simple as well. Say you want the interaction between times 0 and 96: glmLRT(fit, contrast = c(1,-1,0,0,0,0,0,0,-1,1)) Simple, no? Best, Jim On Wed, Sep 3, 2014 at 4:10 PM, Jahn Davik <jahn.davik at="" bioforsk.no<mailto:jahn.davik="" at="" bioforsk.no="">> wrote: Hi Jim, Yes, I believe I can see that (except for possibly my third question). It's just that I tend to think along these lines. Are they wrong? jahn ________________________________ Fra: James W. MacDonald [jmacdon at uw.edu<mailto:jmacdon at="" uw.edu="">] Sendt: 3. september 2014 22:06 Til: Jahn Davik [guest] Kopi: bioconductor at r-project.org<mailto:bioconductor at="" r-project.org="">; Jahn Davik Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, Things will go much easier for you if you just parameterize as cell means rather than a conventional factorial model. grps <- factor(paste(Group, Time, sep = "_")) design <- model.matrix (~0+grps) Then all the comparisons are easy to do. Best, Jim On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] <guest at="" bioconductor.org<mailto:guest="" at="" bioconductor.org="">> wrote: Hi, I am trying to figure out how to do comparisons using edgeR but have to admit that I am on unsure ice. My experiment is abiotic stress on two genotypes (E and J) and these have been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). I would like to analyze this experiment as a factorial (which I am sure could be argued), and when I set up the following design: > design<-model.matrix(~0 + Group + Time + Group*Time), I get the following design matrix (I've taken out the 3 three replicates at experimental cell.): > design GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 E00 1 0 0 0 0 0 0 0 0 0 J00 0 1 0 0 0 0 0 0 0 0 E01 1 0 1 0 0 0 0 0 0 0 J01 0 1 1 0 0 0 1 0 0 0 E05 1 0 0 1 0 0 0 0 0 0 J05 0 1 0 1 0 0 0 1 0 0 E48 1 0 0 0 1 0 0 0 0 0 J48 0 1 0 0 1 0 0 0 1 0 E96 1 0 0 0 0 1 0 0 0 0 J96 0 1 0 0 0 1 0 0 0 1 attr(,"assign") [1] 1 1 2 2 2 2 3 3 3 3 attr(,"contrasts") attr(,"contrasts")$Group [1] "contr.treatment" attr(,"contrasts")$Time [1] "contr.treatment" I have some questions regarding testing contrasts: First though, I realize that I can pull out the data for E00 and J00, say, and compare these using exactTest, but then I guess that you are disregarding all information that lies in the remaining data set? That does not seem to me to be a good way to do it. But I may be wrong here? Second, if I want to do the comparison E00 vs J00 using the design indicated above, how should that be set up? Would lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I set up that contrast? For my own pedagogic reasons I can show how I was thinking in this case: E00 = L1 E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 E05 = L1 + L4 versus --> 2L3 + 2L4 - 3L5 -3L6 E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 E96 = L1 + L6 resulting in the contrast: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Then I seem to remember that contrasts should sum up to zero? If anybody could tell me where I am loosing it, I would be very obliged. Thank you jd -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252 LC_CTYPE=Norwegian (Bokm??l)_Norway.1252 [3] LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 loaded via a namespace (and not attached): [1] tools_3.1.0 > -- Sent via the guest posting facility at bioconductor.org<http: bioconductor.org="">. _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org<mailto:bioconductor at="" r-project.org=""> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]

ADD REPLY • link 9.6 years ago Jahn Davik ▴ 110

0

Entering edit mode

Moreover Jim, And also: what about the interactions? I guess that was one of my issues when I tried to think in traditional factorial terms. To account for them, I mean. But again: I'm in unknown waters here. jahn ________________________________________ Fra: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] på vegne av Jahn Davik [jahn.davik at bioforsk.no] Sendt: 3. september 2014 22:10 Til: jmacdon at u.washington.edu; Jahn Davik [guest] Kopi: bioconductor at r-project.org Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jim, Yes, I believe I can see that (except for possibly my third question). It's just that I tend to think along these lines. Are they wrong? jahn ________________________________ Fra: James W. MacDonald [jmacdon at uw.edu] Sendt: 3. september 2014 22:06 Til: Jahn Davik [guest] Kopi: bioconductor at r-project.org; Jahn Davik Emne: Re: [BioC] edgeR factorial experiment contrasts Hi Jahn, Things will go much easier for you if you just parameterize as cell means rather than a conventional factorial model. grps <- factor(paste(Group, Time, sep = "_")) design <- model.matrix (~0+grps) Then all the comparisons are easy to do. Best, Jim On Wed, Sep 3, 2014 at 12:39 PM, Jahn Davik [guest] <guest at="" bioconductor.org<mailto:guest="" at="" bioconductor.org="">> wrote: Hi, I am trying to figure out how to do comparisons using edgeR but have to admit that I am on unsure ice. My experiment is abiotic stress on two genotypes (E and J) and these have been sampled for RNA-seq at five time points (00, 01, 05, 48, and 96). I would like to analyze this experiment as a factorial (which I am sure could be argued), and when I set up the following design: > design<-model.matrix(~0 + Group + Time + Group*Time), I get the following design matrix (I've taken out the 3 three replicates at experimental cell.): > design GroupE GroupJ Time01 Time05 Time48 Time96 GroupJ:Time01 GroupJ:Time05 GroupJ:Time48 GroupJ:Time96 E00 1 0 0 0 0 0 0 0 0 0 J00 0 1 0 0 0 0 0 0 0 0 E01 1 0 1 0 0 0 0 0 0 0 J01 0 1 1 0 0 0 1 0 0 0 E05 1 0 0 1 0 0 0 0 0 0 J05 0 1 0 1 0 0 0 1 0 0 E48 1 0 0 0 1 0 0 0 0 0 J48 0 1 0 0 1 0 0 0 1 0 E96 1 0 0 0 0 1 0 0 0 0 J96 0 1 0 0 0 1 0 0 0 1 attr(,"assign") [1] 1 1 2 2 2 2 3 3 3 3 attr(,"contrasts") attr(,"contrasts")$Group [1] "contr.treatment" attr(,"contrasts")$Time [1] "contr.treatment" I have some questions regarding testing contrasts: First though, I realize that I can pull out the data for E00 and J00, say, and compare these using exactTest, but then I guess that you are disregarding all information that lies in the remaining data set? That does not seem to me to be a good way to do it. But I may be wrong here? Second, if I want to do the comparison E00 vs J00 using the design indicated above, how should that be set up? Would lrt <- glmLRT(fit,contrast=c(1,-1,0,0,0,0,0,0,0,0)) do it? Third, if I want to compare (E00 + E01 + E05) vs (E48 + E96), how would I set up that contrast? For my own pedagogic reasons I can show how I was thinking in this case: E00 = L1 E01 = L1 + L4 x 2 --> 6L1 + 2L3 + 2L4 E05 = L1 + L4 versus --> 2L3 + 2L4 - 3L5 -3L6 E48 = L1 + L5 x 3 --> 6L1 + 3L5 + 3L6 E96 = L1 + L6 resulting in the contrast: lrt <- glmLRT(fit,contrast=c(0,0,2,2,-3,-3,0,0,0,0)) Then I seem to remember that contrasts should sum up to zero? If anybody could tell me where I am loosing it, I would be very obliged. Thank you jd -- output of sessionInfo(): R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Norwegian (Bokm?l)_Norway.1252 LC_CTYPE=Norwegian (Bokm?l)_Norway.1252 [3] LC_MONETARY=Norwegian (Bokm?l)_Norway.1252 LC_NUMERIC=C [5] LC_TIME=Norwegian (Bokm?l)_Norway.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] edgeR_3.6.8 limma_3.20.9 BiocInstaller_1.14.2 loaded via a namespace (and not attached): [1] tools_3.1.0 > -- Sent via the guest posting facility at bioconductor.org<http: bioconductor.org="">. _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org<mailto:bioconductor at="" r-project.org=""> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]] _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD REPLY • link 9.7 years ago Jahn Davik ▴ 110

Login before adding your answer.