Search
Question: Defining the model matrix for a multi-factorial design with interactions - edgeR
1
4.4 years ago by
Guest User12k
Guest User12k wrote:
Hi all! I have a question about the way to define the model matrix for a multi-factorial design in which I want to model all the interactions. In particular, I have a 3x3x2 factorial design (3 factors) and three biological replicates, so I have 3x3x2x3 = 54 sample. The three factors are: - GENOTYPE with the three levels F, G and P - LOAD with levels HIGH, MEDIUM and LOW - TIME with levels H and PH So this is the design matrix: > design Time Load Genotype FHH_1 H High F FHPH_1 PH High F FHH_2 H High F FHPH_2 PH High F FHH_3 H High F FHPH_3 PH High F FMH_1 H Medium F FMPH_1 PH Medium F FMH_2 H Medium F FMPH_2 PH Medium F FMH_3 H Medium F FMPH_3 PH Medium F FLH_1 H Low F FLPH_1 PH Low F FLH_2 H Low F FLPH_2 PH Low F FLH_3 H Low F FLPH_3 PH Low F GHH_1 H High G GHPH_1 PH High G GHH_2 H High G GHPH_2 PH High G GHH_3 H High G GHPH_3 PH High G GMH_1 H Medium G GMPH_1 PH Medium G GMH_2 H Medium G GMPH_2 PH Medium G GMH_3 H Medium G GMPH_3 PH Medium G GLH_1 H Low G GLPH_1 PH Low G GLH_2 H Low G GLPH_2 PH Low G GLH_3 H Low G GLPH_3 PH Low G PHH_1 H High P PHPH_1 PH High P PHH_2 H High P PHPH_2 PH High P PHH_3 H High P PHPH_3 PH High P PMH_1 H Medium P PMPH_1 PH Medium P PMH_2 H Medium P PMPH_2 PH Medium P PMH_3 H Medium P PMPH_3 PH Medium P PLH_1 H Low P PLPH_1 PH Low P PLH_2 H Low P PLPH_2 PH Low P PLH_3 H Low P PLPH_3 PH Low P I am interested in finding all possible interactions between the three factors, so I want to model also the three way interactions GENOTYPE:LOAD:TIME. I have already read the edgeR vignette so I was wondering if the best way to define the model matrix is by defining it with a formula or by defining each treatment combination as a group. I'm trying to understand which are the differences between the two alternatives, so I tried to estimate both the models: Model 1) mydesign1<-model.matrix(~Genotype * Load * Time, data=design) D1 <- estimateGLMCommonDisp(D, mydesign1) # D is the DGEList of the normalized counts D1 <- estimateGLMTrendedDisp(D1, mydesign1) fit1 <- glmFit(D1, mydesign1) The estimated coefficients are: > colnames(fit1) [1] "(Intercept)" "GenotypeG" [3] "GenotypeP" "LoadLow" [5] "LoadMedium" "TimePH" [7] "GenotypeG:LoadLow" "GenotypeP:LoadLow" [9] "GenotypeG:LoadMedium" "GenotypeP:LoadMedium" [11] "GenotypeG:TimePH" "GenotypeP:TimePH" [13] "LoadLow:TimePH" "LoadMedium:TimePH" [15] "GenotypeG:LoadLow:TimePH" "GenotypeP:LoadLow:TimePH" [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" Model 2) Group <- factor(paste(design$Genotype, design$Load, design$Time, sep=".")) # I have 18 unique levels in Group (54 samples / 3 biological replicates) mydesign2<-model.matrix(~0+Group) colnames(mydesign2) <- levels(Group) D2 <- estimateGLMCommonDisp(D, mydesign2) D2 <- estimateGLMTrendedDisp(D2, mydesign2) fit2<-glmFit(D2, mydesign2) > colnames(fit2) [1] "H.High.F" "H.High.G" "H.High.P" "H.Low.F" "H.Low.G" [6] "H.Low.P" "H.Medium.F" "H.Medium.G" "H.Medium.P" "PH.High.F" [11] "PH.High.G" "PH.High.P" "PH.Low.F" "PH.Low.G" "PH.Low.P" [16] "PH.Medium.F" "PH.Medium.G" "PH.Medium.P" Now, to understand the differences between the two models, my question was this: if I define the contrast: contrast1<-c(rep(0,14), 1, 0, -1, 0) # "GenotypeG:LoadLow:TimePH" vs "GenotypeG:LoadMedium:TimePH" on the first model and I make a LR test, will I obtain the same result as making a LR test on the contrast: contrast2<-makeContrastsG.LowvsMedium.PH = PH.Low.G - PH.Medium.G, levels=mydesign2) on the second model? This is the code I used to make the tests: lrt1.G.LowvsMedium.PH<-glmLRT(fit1, contrast=contrast1) summarydecideTestsDGElrt1.G.LowvsMedium.PH)) and lrt2.G.LowvsMedium.PH<-glmLRT(fit2, contrast=contrast2[," G.LowvsMedium.PH"]) summarydecideTestsDGElrt2.G.LowvsMedium.PH)) The answer to the question is no, because I've tried to do it on my data and the results are different between the two tests. But why? Am I doing two completely different tests? The two parameters of the two models aren't saying me the same thing? In a moment of madness :) I tried to estimate another model, with only the three-level interactions between the factors. Model 3) mydesign3<-model.matrix(~0+Genotype:Load:Time, data=design) D3 <- estimateGLMCommonDisp(D, mydesign3) D3 <- estimateGLMTrendedDisp(D3, mydesign3) fit3<-glmFit(D3, mydesign3) > colnames(fit3) [1] "GenotypeF:LoadHigh:TimeH" "GenotypeG:LoadHigh:TimeH" [3] "GenotypeP:LoadHigh:TimeH" "GenotypeF:LoadLow:TimeH" [5] "GenotypeG:LoadLow:TimeH" "GenotypeP:LoadLow:TimeH" [7] "GenotypeF:LoadMedium:TimeH" "GenotypeG:LoadMedium:TimeH" [9] "GenotypeP:LoadMedium:TimeH" "GenotypeF:LoadHigh:TimePH" [11] "GenotypeG:LoadHigh:TimePH" "GenotypeP:LoadHigh:TimePH" [13] "GenotypeF:LoadLow:TimePH" "GenotypeG:LoadLow:TimePH" [15] "GenotypeP:LoadLow:TimePH" "GenotypeF:LoadMedium:TimePH" [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" Well, this model gave me the same parameter estimates as the model 2 (each treatment combination as a group) and, obviously, the same results in terms of DEG. But is it statistically correct to estimate a model with three-level interactions without any principal effect? Thanks in advance for any replies! Marco -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 [3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C [5] LC_TIME=Italian_Italy.1252 attached base packages: [1] splines parallel stats graphics grDevices utils datasets [8] methods base other attached packages: [1] EDASeq_1.8.0 ShortRead_1.20.0 Rsamtools_1.14.2 [4] lattice_0.20-24 Biostrings_2.30.1 GenomicRanges_1.14.4 [7] XVector_0.2.0 IRanges_1.20.6 Biobase_2.22.0 [10] BiocGenerics_0.8.0 edgeR_3.4.2 limma_3.18.9 [13] aroma.light_1.32.0 matrixStats_0.8.14 yeastRNASeq_0.0.10 loaded via a namespace (and not attached): [1] annotate_1.40.0 AnnotationDbi_1.24.0 bitops_1.0-6 [4] DBI_0.2-7 DESeq_1.14.0 genefilter_1.44.0 [7] geneplotter_1.40.0 grid_3.0.2 hwriter_1.3 [10] latticeExtra_0.6-26 R.methodsS3_1.6.1 R.oo_1.17.0 [13] RColorBrewer_1.0-5 RSQLite_0.11.4 stats4_3.0.2 [16] survival_2.37-6 tools_3.0.2 XML_3.98-1.1 [19] xtable_1.7-1 zlibbioc_1.8.0 -- Sent via the guest posting facility at bioconductor.org. ADD COMMENTlink modified 4.4 years ago by Ryan C. Thompson6.8k • written 4.4 years ago by Guest User12k 0 4.4 years ago by The Scripps Research Institute, La Jolla, CA Ryan C. Thompson6.8k wrote: Hi Marco, Note that your models 2 & 3 are exactly the same model. The expression in your formula for model 3: Genotype:Load:Time is more or less equivalent to: factor(paste(design$Genotype, design$Load, design$Time, sep=":")) except that the factor levels might possibly be in a different order. As for your question about doing the same test with Models 1 & 2, I believe that both models are equivalent parametrizations of the same design, so you should get (nearly) identical dispersion estimates, and performing equivalent tests with the two models should give you the same p-values. If you are not getting the same p-values, then you have not properly set up the contrast for Model 1. This isn't surprising, because a 3-way interaction model is really confusing to work with, and I can't even tell you what the correct contrast would be. I would recommend that you stick with Model 2 (or 3), where each coefficient has a direct biological interpretation as the estimated expression level of a group. Designing contrasts for this parametrization is much easier, in my opinion. I think you would agree, since you got this contrast correct in your example code. -Ryan On 1/29/14, 7:11 AM, Marco [guest] wrote: > Hi all! > > I have a question about the way to define the model matrix for a multi-factorial design in which I want to model all the interactions. In particular, I have a 3x3x2 factorial design (3 factors) and three biological replicates, so I have 3x3x2x3 = 54 sample. > The three factors are: > - GENOTYPE with the three levels F, G and P > - LOAD with levels HIGH, MEDIUM and LOW > - TIME with levels H and PH > > So this is the design matrix: > >> design > Time Load Genotype > FHH_1 H High F > FHPH_1 PH High F > FHH_2 H High F > FHPH_2 PH High F > FHH_3 H High F > FHPH_3 PH High F > FMH_1 H Medium F > FMPH_1 PH Medium F > FMH_2 H Medium F > FMPH_2 PH Medium F > FMH_3 H Medium F > FMPH_3 PH Medium F > FLH_1 H Low F > FLPH_1 PH Low F > FLH_2 H Low F > FLPH_2 PH Low F > FLH_3 H Low F > FLPH_3 PH Low F > GHH_1 H High G > GHPH_1 PH High G > GHH_2 H High G > GHPH_2 PH High G > GHH_3 H High G > GHPH_3 PH High G > GMH_1 H Medium G > GMPH_1 PH Medium G > GMH_2 H Medium G > GMPH_2 PH Medium G > GMH_3 H Medium G > GMPH_3 PH Medium G > GLH_1 H Low G > GLPH_1 PH Low G > GLH_2 H Low G > GLPH_2 PH Low G > GLH_3 H Low G > GLPH_3 PH Low G > PHH_1 H High P > PHPH_1 PH High P > PHH_2 H High P > PHPH_2 PH High P > PHH_3 H High P > PHPH_3 PH High P > PMH_1 H Medium P > PMPH_1 PH Medium P > PMH_2 H Medium P > PMPH_2 PH Medium P > PMH_3 H Medium P > PMPH_3 PH Medium P > PLH_1 H Low P > PLPH_1 PH Low P > PLH_2 H Low P > PLPH_2 PH Low P > PLH_3 H Low P > PLPH_3 PH Low P > > > I am interested in finding all possible interactions between the three factors, so I want to model also the three way interactions GENOTYPE:LOAD:TIME. I have already read the edgeR vignette so I was wondering if the best way to define the model matrix is by defining it with a formula or by defining each treatment combination as a group. > I'm trying to understand which are the differences between the two alternatives, so I tried to estimate both the models: > > Model 1) > mydesign1<-model.matrix(~Genotype * Load * Time, data=design) > D1 <- estimateGLMCommonDisp(D, mydesign1) # D is the DGEList of the normalized counts > D1 <- estimateGLMTrendedDisp(D1, mydesign1) > fit1 <- glmFit(D1, mydesign1) > > The estimated coefficients are: > > colnames(fit1) > [1] "(Intercept)" "GenotypeG" > [3] "GenotypeP" "LoadLow" > [5] "LoadMedium" "TimePH" > [7] "GenotypeG:LoadLow" "GenotypeP:LoadLow" > [9] "GenotypeG:LoadMedium" "GenotypeP:LoadMedium" > [11] "GenotypeG:TimePH" "GenotypeP:TimePH" > [13] "LoadLow:TimePH" "LoadMedium:TimePH" > [15] "GenotypeG:LoadLow:TimePH" "GenotypeP:LoadLow:TimePH" > [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" > > > Model 2) > Group <- factor(paste(design$Genotype, design$Load, design$Time, sep=".")) > # I have 18 unique levels in Group (54 samples / 3 biological replicates) > > mydesign2<-model.matrix(~0+Group) > colnames(mydesign2) <- levels(Group) > > D2 <- estimateGLMCommonDisp(D, mydesign2) > D2 <- estimateGLMTrendedDisp(D2, mydesign2) > > fit2<-glmFit(D2, mydesign2) > > colnames(fit2) > [1] "H.High.F" "H.High.G" "H.High.P" "H.Low.F" "H.Low.G" > [6] "H.Low.P" "H.Medium.F" "H.Medium.G" "H.Medium.P" "PH.High.F" > [11] "PH.High.G" "PH.High.P" "PH.Low.F" "PH.Low.G" "PH.Low.P" > [16] "PH.Medium.F" "PH.Medium.G" "PH.Medium.P" > > Now, to understand the differences between the two models, my question was this: > if I define the contrast: > > contrast1<-c(rep(0,14), 1, 0, -1, 0) # "GenotypeG:LoadLow:TimePH" vs "GenotypeG:LoadMedium:TimePH" > > on the first model and I make a LR test, will I obtain the same result as making a LR test on the contrast: > > contrast2<-makeContrastsG.LowvsMedium.PH = PH.Low.G - PH.Medium.G, levels=mydesign2) > > on the second model? > This is the code I used to make the tests: > > lrt1.G.LowvsMedium.PH<-glmLRT(fit1, contrast=contrast1) > summary(decideTestsDGElrt1.G.LowvsMedium.PH)) > > and > > lrt2.G.LowvsMedium.PH<-glmLRT(fit2, contrast=contrast2[," G.LowvsMedium.PH"]) > summary(decideTestsDGElrt2.G.LowvsMedium.PH)) > > The answer to the question is no, because I've tried to do it on my data and the results are different between the two tests. But why? Am I doing two completely different tests? The two parameters of the two models aren't saying me the same thing? > > In a moment of madness :) I tried to estimate another model, with only the three-level interactions between the factors. > > Model 3) > mydesign3<-model.matrix(~0+Genotype:Load:Time, data=design) > > D3 <- estimateGLMCommonDisp(D, mydesign3) > D3 <- estimateGLMTrendedDisp(D3, mydesign3) > > fit3<-glmFit(D3, mydesign3) > > colnames(fit3) > [1] "GenotypeF:LoadHigh:TimeH" "GenotypeG:LoadHigh:TimeH" > [3] "GenotypeP:LoadHigh:TimeH" "GenotypeF:LoadLow:TimeH" > [5] "GenotypeG:LoadLow:TimeH" "GenotypeP:LoadLow:TimeH" > [7] "GenotypeF:LoadMedium:TimeH" "GenotypeG:LoadMedium:TimeH" > [9] "GenotypeP:LoadMedium:TimeH" "GenotypeF:LoadHigh:TimePH" > [11] "GenotypeG:LoadHigh:TimePH" "GenotypeP:LoadHigh:TimePH" > [13] "GenotypeF:LoadLow:TimePH" "GenotypeG:LoadLow:TimePH" > [15] "GenotypeP:LoadLow:TimePH" "GenotypeF:LoadMedium:TimePH" > [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" > > Well, this model gave me the same parameter estimates as the model 2 (each treatment combination as a group) and, obviously, the same results in terms of DEG. But is it statistically correct to estimate a model with three-level interactions without any principal effect? > > Thanks in advance for any replies! > > Marco > > -- output of sessionInfo(): > > R version 3.0.2 (2013-09-25) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 > [3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C > [5] LC_TIME=Italian_Italy.1252 > > attached base packages: > [1] splines parallel stats graphics grDevices utils datasets > [8] methods base > > other attached packages: > [1] EDASeq_1.8.0 ShortRead_1.20.0 Rsamtools_1.14.2 > [4] lattice_0.20-24 Biostrings_2.30.1 GenomicRanges_1.14.4 > [7] XVector_0.2.0 IRanges_1.20.6 Biobase_2.22.0 > [10] BiocGenerics_0.8.0 edgeR_3.4.2 limma_3.18.9 > [13] aroma.light_1.32.0 matrixStats_0.8.14 yeastRNASeq_0.0.10 > > loaded via a namespace (and not attached): > [1] annotate_1.40.0 AnnotationDbi_1.24.0 bitops_1.0-6 > [4] DBI_0.2-7 DESeq_1.14.0 genefilter_1.44.0 > [7] geneplotter_1.40.0 grid_3.0.2 hwriter_1.3 > [10] latticeExtra_0.6-26 R.methodsS3_1.6.1 R.oo_1.17.0 > [13] RColorBrewer_1.0-5 RSQLite_0.11.4 stats4_3.0.2 > [16] survival_2.37-6 tools_3.0.2 XML_3.98-1.1 > [19] xtable_1.7-1 zlibbioc_1.8.0 > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor ADD COMMENTlink written 4.4 years ago by Ryan C. Thompson6.8k Dear Ryan, thank you very much for your reply. I agree with you about the easier interpretation of the parametrization of Model 2. About the differences between Models 1 and 2, you are right, the dispersion parameters estimates are nearly identical but the p-values are very different between the tests. Probably, as you told me, the reason is I make some mistakes in the definition of the contrast. Thank you again. Have a good day, Marco 2014-01-29 Ryan <rct@thompsonclan.org> > Hi Marco, > > Note that your models 2 & 3 are exactly the same model. The expression in > your formula for model 3: > > Genotype:Load:Time > > is more or less equivalent to: > > factor(paste(design$Genotype, design$Load, design$Time, sep=":")) > > except that the factor levels might possibly be in a different order. As > for your question about doing the same test with Models 1 & 2, I believe > that both models are equivalent parametrizations of the same design, so you > should get (nearly) identical dispersion estimates, and performing > equivalent tests with the two models should give you the same p-values. If > you are not getting the same p-values, then you have not properly set up > the contrast for Model 1. This isn't surprising, because a 3-way > interaction model is really confusing to work with, and I can't even tell > you what the correct contrast would be. I would recommend that you stick > with Model 2 (or 3), where each coefficient has a direct biological > interpretation as the estimated expression level of a group. Designing > contrasts for this parametrization is much easier, in my opinion. I think > you would agree, since you got this contrast correct in your example code. > > -Ryan > > > On 1/29/14, 7:11 AM, Marco [guest] wrote: > >> Hi all! >> >> I have a question about the way to define the model matrix for a >> multi-factorial design in which I want to model all the interactions. In >> particular, I have a 3x3x2 factorial design (3 factors) and three >> biological replicates, so I have 3x3x2x3 = 54 sample. >> The three factors are: >> - GENOTYPE with the three levels F, G and P >> - LOAD with levels HIGH, MEDIUM and LOW >> - TIME with levels H and PH >> >> So this is the design matrix: >> >> design >>> >> Time Load Genotype >> FHH_1 H High F >> FHPH_1 PH High F >> FHH_2 H High F >> FHPH_2 PH High F >> FHH_3 H High F >> FHPH_3 PH High F >> FMH_1 H Medium F >> FMPH_1 PH Medium F >> FMH_2 H Medium F >> FMPH_2 PH Medium F >> FMH_3 H Medium F >> FMPH_3 PH Medium F >> FLH_1 H Low F >> FLPH_1 PH Low F >> FLH_2 H Low F >> FLPH_2 PH Low F >> FLH_3 H Low F >> FLPH_3 PH Low F >> GHH_1 H High G >> GHPH_1 PH High G >> GHH_2 H High G >> GHPH_2 PH High G >> GHH_3 H High G >> GHPH_3 PH High G >> GMH_1 H Medium G >> GMPH_1 PH Medium G >> GMH_2 H Medium G >> GMPH_2 PH Medium G >> GMH_3 H Medium G >> GMPH_3 PH Medium G >> GLH_1 H Low G >> GLPH_1 PH Low G >> GLH_2 H Low G >> GLPH_2 PH Low G >> GLH_3 H Low G >> GLPH_3 PH Low G >> PHH_1 H High P >> PHPH_1 PH High P >> PHH_2 H High P >> PHPH_2 PH High P >> PHH_3 H High P >> PHPH_3 PH High P >> PMH_1 H Medium P >> PMPH_1 PH Medium P >> PMH_2 H Medium P >> PMPH_2 PH Medium P >> PMH_3 H Medium P >> PMPH_3 PH Medium P >> PLH_1 H Low P >> PLPH_1 PH Low P >> PLH_2 H Low P >> PLPH_2 PH Low P >> PLH_3 H Low P >> PLPH_3 PH Low P >> >> >> I am interested in finding all possible interactions between the three >> factors, so I want to model also the three way interactions >> GENOTYPE:LOAD:TIME. I have already read the edgeR vignette so I was >> wondering if the best way to define the model matrix is by defining it with >> a formula or by defining each treatment combination as a group. >> I'm trying to understand which are the differences between the two >> alternatives, so I tried to estimate both the models: >> >> Model 1) >> mydesign1<-model.matrix(~Genotype * Load * Time, data=design) >> D1 <- estimateGLMCommonDisp(D, mydesign1) # D is the DGEList of the >> normalized counts >> D1 <- estimateGLMTrendedDisp(D1, mydesign1) >> fit1 <- glmFit(D1, mydesign1) >> >> The estimated coefficients are: >> > colnames(fit1) >> [1] "(Intercept)" "GenotypeG" >> [3] "GenotypeP" "LoadLow" >> [5] "LoadMedium" "TimePH" >> [7] "GenotypeG:LoadLow" "GenotypeP:LoadLow" >> [9] "GenotypeG:LoadMedium" "GenotypeP:LoadMedium" >> [11] "GenotypeG:TimePH" "GenotypeP:TimePH" >> [13] "LoadLow:TimePH" "LoadMedium:TimePH" >> [15] "GenotypeG:LoadLow:TimePH" "GenotypeP:LoadLow:TimePH" >> [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" >> Model 2) >> Group <- factor(paste(design$Genotype, design$Load, design\$Time, >> sep=".")) >> # I have 18 unique levels in Group (54 samples / 3 biological >> replicates) >> >> mydesign2<-model.matrix(~0+Group) >> colnames(mydesign2) <- levels(Group) >> >> D2 <- estimateGLMCommonDisp(D, mydesign2) >> D2 <- estimateGLMTrendedDisp(D2, mydesign2) >> >> fit2<-glmFit(D2, mydesign2) >> > colnames(fit2) >> [1] "H.High.F" "H.High.G" "H.High.P" "H.Low.F" >> "H.Low.G" >> [6] "H.Low.P" "H.Medium.F" "H.Medium.G" "H.Medium.P" >> "PH.High.F" >> [11] "PH.High.G" "PH.High.P" "PH.Low.F" "PH.Low.G" >> "PH.Low.P" >> [16] "PH.Medium.F" "PH.Medium.G" "PH.Medium.P" >> Now, to understand the differences between the two models, my question >> was this: >> if I define the contrast: >> >> contrast1<-c(rep(0,14), 1, 0, -1, 0) # "GenotypeG:LoadLow:TimePH" vs >> "GenotypeG:LoadMedium:TimePH" >> >> on the first model and I make a LR test, will I obtain the same result as >> making a LR test on the contrast: >> >> contrast2<-makeContrastsG.LowvsMedium.PH = PH.Low.G - PH.Medium.G, >> levels=mydesign2) >> >> on the second model? >> This is the code I used to make the tests: >> >> lrt1.G.LowvsMedium.PH<-glmLRT(fit1, contrast=contrast1) >> summary(decideTestsDGElrt1.G.LowvsMedium.PH)) >> and >> >> lrt2.G.LowvsMedium.PH<-glmLRT(fit2, contrast=contrast2[," >> G.LowvsMedium.PH"]) >> summary(decideTestsDGElrt2.G.LowvsMedium.PH)) >> >> The answer to the question is no, because I've tried to do it on my data >> and the results are different between the two tests. But why? Am I doing >> two completely different tests? The two parameters of the two models aren't >> saying me the same thing? >> >> In a moment of madness :) I tried to estimate another model, with only >> the three-level interactions between the factors. >> >> Model 3) >> mydesign3<-model.matrix(~0+Genotype:Load:Time, data=design) >> >> D3 <- estimateGLMCommonDisp(D, mydesign3) >> D3 <- estimateGLMTrendedDisp(D3, mydesign3) >> >> fit3<-glmFit(D3, mydesign3) >> > colnames(fit3) >> [1] "GenotypeF:LoadHigh:TimeH" "GenotypeG:LoadHigh:TimeH" >> [3] "GenotypeP:LoadHigh:TimeH" "GenotypeF:LoadLow:TimeH" >> [5] "GenotypeG:LoadLow:TimeH" "GenotypeP:LoadLow:TimeH" >> [7] "GenotypeF:LoadMedium:TimeH" "GenotypeG:LoadMedium:TimeH" >> [9] "GenotypeP:LoadMedium:TimeH" "GenotypeF:LoadHigh:TimePH" >> [11] "GenotypeG:LoadHigh:TimePH" "GenotypeP:LoadHigh:TimePH" >> [13] "GenotypeF:LoadLow:TimePH" "GenotypeG:LoadLow:TimePH" >> [15] "GenotypeP:LoadLow:TimePH" "GenotypeF:LoadMedium:TimePH" >> [17] "GenotypeG:LoadMedium:TimePH" "GenotypeP:LoadMedium:TimePH" >> Well, this model gave me the same parameter estimates as the model 2 >> (each treatment combination as a group) and, obviously, the same results in >> terms of DEG. But is it statistically correct to estimate a model with >> three-level interactions without any principal effect? >> >> Thanks in advance for any replies! >> >> Marco >> >> -- output of sessionInfo(): >> >> R version 3.0.2 (2013-09-25) >> Platform: x86_64-w64-mingw32/x64 (64-bit) >> >> locale: >> [1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 >> [3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C >> [5] LC_TIME=Italian_Italy.1252 >> >> attached base packages: >> [1] splines parallel stats graphics grDevices utils datasets >> [8] methods base >> >> other attached packages: >> [1] EDASeq_1.8.0 ShortRead_1.20.0 Rsamtools_1.14.2 >> [4] lattice_0.20-24 Biostrings_2.30.1 GenomicRanges_1.14.4 >> [7] XVector_0.2.0 IRanges_1.20.6 Biobase_2.22.0 >> [10] BiocGenerics_0.8.0 edgeR_3.4.2 limma_3.18.9 >> [13] aroma.light_1.32.0 matrixStats_0.8.14 yeastRNASeq_0.0.10 >> >> loaded via a namespace (and not attached): >> [1] annotate_1.40.0 AnnotationDbi_1.24.0 bitops_1.0-6 >> [4] DBI_0.2-7 DESeq_1.14.0 genefilter_1.44.0 >> [7] geneplotter_1.40.0 grid_3.0.2 hwriter_1.3 >> [10] latticeExtra_0.6-26 R.methodsS3_1.6.1 R.oo_1.17.0 >> [13] RColorBrewer_1.0-5 RSQLite_0.11.4 stats4_3.0.2 >> [16] survival_2.37-6 tools_3.0.2 XML_3.98-1.1 >> [19] xtable_1.7-1 zlibbioc_1.8.0 >> >> -- >> Sent via the guest posting facility at bioconductor.org. >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane. >> science.biology.informatics.conductor >> > > [[alternative HTML version deleted]]