Model Design
1
0
Entering edit mode
@james-w-macdonald-5106
Last seen 10 hours ago
United States
Hi Sander, Please don't take conversations off-list (e.g., use 'Reply all' when responding). On Mon, Sep 1, 2014 at 10:59 AM, Sander van der Zeeuw < s.a.j.van_der_zeeuw at lumc.nl> wrote: > Hi Jim, > > thanks for the information, i have been struggling for a few days now > trying to implement your tips. If i want to compare the 8 samples divided > into 4 subjects with no treatment or control or what so ever. I have only > one model which produces a reasonable result. I just do not know what > exactly is happening. This is the contrast and design model i have been > using: > This is a problem to start with. I have no idea what 'i want to compare the 8 samples divided into 4 subjects with no treatment or control or what so ever' means. You will need to be more clear on what your goals are for me (or anybody else) to be able to help you. contrast <- makeContrasts(subject1-subject2+subject3-subject4, levels = > design); > Contrasts > Levels subject1 - subject2 + subject3 - subject4 > subject1 1 > subject2 -1 > subject3 1 > subject4 -1 > > This contrast is testing an interaction between subjects 1 and 2 versus subjects 4 and 3. Without knowing what the subjects are, I cannot say if this is a reasonable thing to be doing. But please note two things: 1.) Any comparison in ANOVA is just simple algebra, so if you have taken algebra, you should be able to figure out what the comparison is. 2.) Given 1.), we can decompose your contrast and see what it is testing: Your contrast is subject1 - subject2 + subject3 - subject4 we can rewrite that as (subject1 - subject2) - (subject4 - subject3) so at a certain level, what you are doing is computing the difference between subjects 1 and 2, and then comparing that to the difference between 4 and 3. If the value in the first parenthesis is pretty close to the second, then you won't have much evidence for a difference. Alternatively, if the values in the two parentheses are different (and larger than expected given the intra-group variability), you will likely reject the null hypothesis and say there appears to be a difference. But what does this mean? In your case, I have no idea. In addition, I have no idea if you are really (or should be) looking for an interaction here. Which gets us back to my first point; what exactly (and I mean EXACTLY) are you trying to do? If you simply want to do all possible comparisons, then you just set it up in the most obvious way: contrast <- makeContrasts(subject1-subject2, subject1-subject3, subject1-subject4, subject2-subject3, subject2-subject4, subject3-subject4, levels = design) Best, Jim > > design <- model.matrix(~ 0 + subject, data = group ); > > design > > subject1 subject2 subject3 subject4 > S7309 1 0 0 0 > S7310 1 0 0 0 > S7311 0 1 0 0 > S7312 0 1 0 0 > S7313 0 0 1 0 > S7314 0 0 1 0 > S7315 0 0 0 1 > S7316 0 0 0 1 > attr(,"assign") > [1] 1 1 1 1 > attr(,"contrasts") > attr(,"contrasts")$subject > [1] "contr.treatment" > > Do you think that this will be the correct setup now? Because i doubt this > strongly. If you have any interesting tips please give them to me. > > Best, > > regards, > > Sander > > > > On 08/21/2014 04:12 PM, James W. MacDonald wrote: > > Hi Sander, > > On Aug 21, 2014 8:59 AM, "Sander [guest]" <guest at="" bioconductor.org=""> wrote: > > > > Dear edgeR maintainers, > > > > I have some troubles setting up the correct design for my DE experiment. > I now how my experiment design should look like and that is like this: > > > design > > subject1 subject2 subject3 subject4 > > S7309 1 0 0 0 > > S7310 1 0 0 0 > > S7311 0 1 0 0 > > S7312 0 1 0 0 > > S7313 0 0 1 0 > > S7314 0 0 1 0 > > S7315 0 0 0 1 > > S7316 0 0 0 1 > > attr(,"assign") > > [1] 1 1 1 1 > > attr(,"contrasts") > > attr(,"contrasts")$subject > > [1] "contr.treatment" > > > > After that i make my contrasts: > > > makeContrasts(subject1,subject2,subject3,subject4, levels = design) > > This is where you have made a mistake. Toy should be subtracting one > subject from another, depending on the contrasts you care about. > > The resulting contrasts matrix should have both positive and negative > values, not just zeros and ones. > > See the edgeR users guide or limma users guide for examples. > > Best, > > Jim > > > Contrasts > > Levels subject1 subject2 subject3 subject4 > > subject1 1 0 0 0 > > subject2 0 1 0 0 > > subject3 0 0 1 0 > > subject4 0 0 0 1 > > > > This all looks right since i need to compare the subjects with each > other. But when i run my analysis function which looks like this: > > > > library( edgeR ); > > library( ggplot2 ); > > library( reshape ); > > library( FactoMineR ); > > > > analyse <- function( counts, design, contrast, name, style ) { > > counts <- counts[ rowSums( counts, na.rm = TRUE ) > 0, ]; > > y <- DGEList( counts = counts, genes = rownames( counts ) ); > > y <- calcNormFactors( y ); > > y <- estimateGLMCommonDisp( y, design ); > > y <- estimateGLMTrendedDisp( y, design, df = 5 ); > > y <- estimateGLMTagwiseDisp( y, design ); > > > > fit <- glmFit( y, design ); > > lrt <- glmLRT( fit, contrast = contrast ); > > de <- decideTestsDGE( lrt, p = 0.05, adjust = "BH" ); > > cpmY <- cpm( y ); > > > > daf <- designAsFactor( design ); > > orderedDesign <- design[ order( daf, names( daf ) ), ]; > > tab <- data.frame( > > row.names = rownames( cpmY ), > > genes = rownames( cpmY ), > > de = de, > > cpmY[ ,order( daf, names( daf ) ) ] > > ); > > > > aRepTab <- topTags( lrt, n = nrow( counts ) )$table; > > aRepTab$rank <- 1:nrow( counts ); > > # repTab <- tab[ match( aRepTab$genes, rownames( tab ) ), ]; > > > > repTab <- merge( aRepTab, tab, by = "genes", sort = FALSE ); > > repTab <- repTab[ order( repTab$rank ), ]; > > # data.frame( > > # row.names = rownames( aRepTab ), > > # aRepTab, > > # tab[ match( aRepTab$genes, tab$genes ), ] > > # ) > > > > list( > > name = name, > > y = y, > > fit = fit, > > lrt = lrt, > > de = de, > > tab = tab, > > style = style, > > repTab = repTab, > > orderedDesign = orderedDesign > > ); > > } > > > > I got the following error: > > > > Error in mglmLevenberg(y, design = design, dispersion = dispersion, > offset = offset, : > > BLAS/LAPACK routine 'DGEMM ' gave error code -13 > > 5 mglmLevenberg(y, design = design, dispersion = dispersion, offset = > offset, > > weights = weights, coef.start = start, maxit = 250) > > 4 glmFit.default(glmfit$counts, design = design0, offset = glmfit$offset, > > weights = glmfit$weights, dispersion = glmfit$dispersion, > > prior.count = 0) > > 3 glmFit(glmfit$counts, design = design0, offset = glmfit$offset, > > weights = glmfit$weights, dispersion = glmfit$dispersion, > > prior.count = 0) > > 2 glmLRT(fit, contrast = contrast) at diffexpr.R#15 > > 1 analyse(counts, design, contrast, countsId, style) > > > > I tried to use different models but i cannot succeed to avoid the error > for this comparison. (other comparisons do succeed) Any hints will be very > appreciated. > > > > Thanks in advance! > > > > Best regards, > > > > Sander > > > > -- output of sessionInfo(): > > > > > sessionInfo() > > R version 3.1.0 (2014-04-10) > > Platform: x86_64-pc-linux-gnu (64-bit) > > > > locale: > > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > LC_MONETARY=en_US.UTF-8 > > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C > LC_ADDRESS=C LC_TELEPHONE=C > > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > > > attached base packages: > > [1] splines stats graphics grDevices utils datasets methods > base > > > > other attached packages: > > [1] FactoMineR_1.26 reshape_0.8.5 ggplot2_1.0.0 reshape2_1.4 > edgeR_3.6.7 limma_3.20.8 > > > > loaded via a namespace (and not attached): > > [1] car_2.0-20 cluster_1.15.2 colorspace_1.2-4 > digest_0.6.4 grid_3.1.0 gtable_0.1.2 > > [7] htmltools_0.2.4 lattice_0.20-29 leaps_2.9 > MASS_7.3-33 munsell_0.4.2 nnet_7.3-8 > > [13] plyr_1.8.1 proto_0.3-10 Rcpp_0.11.2 > rmarkdown_0.2.49 scales_0.2.4 scatterplot3d_0.3-35 > > [19] stringr_0.6.2 tools_3.1.0 yaml_2.1.13 > > > > -- > > Sent via the guest posting facility at bioconductor.org. > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor at r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099 [[alternative HTML version deleted]]
limma edgeR • 1.0k views
ADD COMMENT
0
Entering edit mode
@sander-van-der-zeeuw-6723
Last seen 10.2 years ago
Hi Jim, Sorry for replying only to you, i thought that i pressed reply all maybe i had chosen the wrong button. But to come back to my comparison i have a RNA-seq data from 4 mouse divided over 2 * 4 samples where one of the 2 is stemcell and the other isn't. I did the comparison of stemcell vs no stemcell where sample pairing is not important. The last comparison i want to do is to not look at differences in stemcell and no-stemcell, but look at the differences between the 4 mouses. Therefore i paired the samples as seen in the design. I think that indeed i should use your'e last code: contrast <- makeContrasts(subject1-subject2, subject1-subject3, subject1-subject4, subject2-subject3, subject2-subject4, subject3-subject4, levels = design). Since this is comparing all animals with each other. Thanks for the help and explanation. Best, Sander On 09/02/2014 04:19 PM, James W. MacDonald wrote: > Hi Sander, > > Please don't take conversations off-list (e.g., use 'Reply all' when > responding). > > > On Mon, Sep 1, 2014 at 10:59 AM, Sander van der Zeeuw > <s.a.j.van_der_zeeuw at="" lumc.nl="" <mailto:s.a.j.van_der_zeeuw="" at="" lumc.nl="">> wrote: > > Hi Jim, > > thanks for the information, i have been struggling for a few days > now trying to implement your tips. If i want to compare the 8 > samples divided into 4 subjects with no treatment or control or > what so ever. I have only one model which produces a reasonable > result. I just do not know what exactly is happening. This is the > contrast and design model i have been using: > > > This is a problem to start with. I have no idea what 'i want to > compare the 8 samples divided into 4 subjects with no treatment or > control or what so ever' means. You will need to be more clear on what > your goals are for me (or anybody else) to be able to help you. > > > contrast <- makeContrasts(subject1-subject2+subject3-subject4, > levels = design); > Contrasts > Levels subject1 - subject2 + subject3 - subject4 > subject1 1 > subject2 -1 > subject3 1 > subject4 -1 > > > This contrast is testing an interaction between subjects 1 and 2 > versus subjects 4 and 3. Without knowing what the subjects are, I > cannot say if this is a reasonable thing to be doing. But please note > two things: > > 1.) Any comparison in ANOVA is just simple algebra, so if you have > taken algebra, you should be able to figure out what the comparison is. > 2.) Given 1.), we can decompose your contrast and see what it is testing: > > Your contrast is > subject1 - subject2 + subject3 - subject4 > > we can rewrite that as > > (subject1 - subject2) - (subject4 - subject3) > > so at a certain level, what you are doing is computing the difference > between subjects 1 and 2, and then comparing that to the difference > between 4 and 3. If the value in the first parenthesis is pretty close > to the second, then you won't have much evidence for a difference. > Alternatively, if the values in the two parentheses are different (and > larger than expected given the intra-group variability), you will > likely reject the null hypothesis and say there appears to be a > difference. > > But what does this mean? In your case, I have no idea. In addition, I > have no idea if you are really (or should be) looking for an > interaction here. Which gets us back to my first point; what exactly > (and I mean EXACTLY) are you trying to do? > > If you simply want to do all possible comparisons, then you just set > it up in the most obvious way: > > contrast <- makeContrasts(subject1-subject2, subject1-subject3, > subject1-subject4, subject2-subject3, subject2-subject4, > subject3-subject4, levels = design) > > Best, > > Jim > > > > design <- model.matrix(~ 0 + subject, data = group ); > > design > > subject1 subject2 subject3 subject4 > S7309 1 0 0 0 > S7310 1 0 0 0 > S7311 0 1 0 0 > S7312 0 1 0 0 > S7313 0 0 1 0 > S7314 0 0 1 0 > S7315 0 0 0 1 > S7316 0 0 0 1 > attr(,"assign") > [1] 1 1 1 1 > attr(,"contrasts") > attr(,"contrasts")$subject > [1] "contr.treatment" > > Do you think that this will be the correct setup now? Because i > doubt this strongly. If you have any interesting tips please give > them to me. > > Best, > > regards, > > Sander > > > > On 08/21/2014 04:12 PM, James W. MacDonald wrote: >> >> Hi Sander, >> >> On Aug 21, 2014 8:59 AM, "Sander [guest]" <guest at="" bioconductor.org="">> <mailto:guest at="" bioconductor.org="">> wrote: >> > >> > Dear edgeR maintainers, >> > >> > I have some troubles setting up the correct design for my DE >> experiment. I now how my experiment design should look like and >> that is like this: >> > > design >> > subject1 subject2 subject3 subject4 >> > S7309 1 0 0 0 >> > S7310 1 0 0 0 >> > S7311 0 1 0 0 >> > S7312 0 1 0 0 >> > S7313 0 0 1 0 >> > S7314 0 0 1 0 >> > S7315 0 0 0 1 >> > S7316 0 0 0 1 >> > attr(,"assign") >> > [1] 1 1 1 1 >> > attr(,"contrasts") >> > attr(,"contrasts")$subject >> > [1] "contr.treatment" >> > >> > After that i make my contrasts: >> > > makeContrasts(subject1,subject2,subject3,subject4, levels = >> design) >> >> This is where you have made a mistake. Toy should be subtracting >> one subject from another, depending on the contrasts you care about. >> >> The resulting contrasts matrix should have both positive and >> negative values, not just zeros and ones. >> >> See the edgeR users guide or limma users guide for examples. >> >> Best, >> >> Jim >> >> > Contrasts >> > Levels subject1 subject2 subject3 subject4 >> > subject1 1 0 0 0 >> > subject2 0 1 0 0 >> > subject3 0 0 1 0 >> > subject4 0 0 0 1 >> > >> > This all looks right since i need to compare the subjects with >> each other. But when i run my analysis function which looks like >> this: >> > >> > library( edgeR ); >> > library( ggplot2 ); >> > library( reshape ); >> > library( FactoMineR ); >> > >> > analyse <- function( counts, design, contrast, name, style ) { >> > counts <- counts[ rowSums( counts, na.rm = TRUE ) > 0, ]; >> > y <- DGEList( counts = counts, genes = rownames( counts ) ); >> > y <- calcNormFactors( y ); >> > y <- estimateGLMCommonDisp( y, design ); >> > y <- estimateGLMTrendedDisp( y, design, df = 5 ); >> > y <- estimateGLMTagwiseDisp( y, design ); >> > >> > fit <- glmFit( y, design ); >> > lrt <- glmLRT( fit, contrast = contrast ); >> > de <- decideTestsDGE( lrt, p = 0.05, adjust = "BH" ); >> > cpmY <- cpm( y ); >> > >> > daf <- designAsFactor( design ); >> > orderedDesign <- design[ order( daf, names( daf ) ), ]; >> > tab <- data.frame( >> > row.names = rownames( cpmY ), >> > genes = rownames( cpmY ), >> > de = de, >> > cpmY[ ,order( daf, names( daf ) ) ] >> > ); >> > >> > aRepTab <- topTags( lrt, n = nrow( counts ) )$table; >> > aRepTab$rank <- 1:nrow( counts ); >> > # repTab <- tab[ match( aRepTab$genes, rownames( tab ) ), ]; >> > >> > repTab <- merge( aRepTab, tab, by = "genes", sort = FALSE ); >> > repTab <- repTab[ order( repTab$rank ), ]; >> > # data.frame( >> > # row.names = rownames( aRepTab ), >> > # aRepTab, >> > # tab[ match( aRepTab$genes, tab$genes ), ] >> > # ) >> > >> > list( >> > name = name, >> > y = y, >> > fit = fit, >> > lrt = lrt, >> > de = de, >> > tab = tab, >> > style = style, >> > repTab = repTab, >> > orderedDesign = orderedDesign >> > ); >> > } >> > >> > I got the following error: >> > >> > Error in mglmLevenberg(y, design = design, dispersion = >> dispersion, offset = offset, : >> > BLAS/LAPACK routine 'DGEMM ' gave error code -13 >> > 5 mglmLevenberg(y, design = design, dispersion = dispersion, >> offset = offset, >> > weights = weights, coef.start = start, maxit = 250) >> > 4 glmFit.default(glmfit$counts, design = design0, offset = >> glmfit$offset, >> > weights = glmfit$weights, dispersion = glmfit$dispersion, >> > prior.count = 0) >> > 3 glmFit(glmfit$counts, design = design0, offset = glmfit$offset, >> > weights = glmfit$weights, dispersion = glmfit$dispersion, >> > prior.count = 0) >> > 2 glmLRT(fit, contrast = contrast) at diffexpr.R#15 >> > 1 analyse(counts, design, contrast, countsId, style) >> > >> > I tried to use different models but i cannot succeed to avoid >> the error for this comparison. (other comparisons do succeed) Any >> hints will be very appreciated. >> > >> > Thanks in advance! >> > >> > Best regards, >> > >> > Sander >> > >> > -- output of sessionInfo(): >> > >> > > sessionInfo() >> > R version 3.1.0 (2014-04-10) >> > Platform: x86_64-pc-linux-gnu (64-bit) >> > >> > locale: >> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> LC_MONETARY=en_US.UTF-8 >> > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 >> LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C >> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> > >> > attached base packages: >> > [1] splines stats graphics grDevices utils datasets >> methods base >> > >> > other attached packages: >> > [1] FactoMineR_1.26 reshape_0.8.5 ggplot2_1.0.0 >> reshape2_1.4 edgeR_3.6.7 limma_3.20.8 >> > >> > loaded via a namespace (and not attached): >> > [1] car_2.0-20 cluster_1.15.2 colorspace_1.2-4 >> digest_0.6.4 grid_3.1.0 gtable_0.1.2 >> > [7] htmltools_0.2.4 lattice_0.20-29 leaps_2.9 >> MASS_7.3-33 munsell_0.4.2 nnet_7.3-8 >> > [13] plyr_1.8.1 proto_0.3-10 Rcpp_0.11.2 >> rmarkdown_0.2.49 scales_0.2.4 scatterplot3d_0.3-35 >> > [19] stringr_0.6.2 tools_3.1.0 yaml_2.1.13 >> > >> > -- >> > Sent via the guest posting facility at bioconductor.org >> <http: bioconductor.org="">. >> > >> > _______________________________________________ >> > Bioconductor mailing list >> > Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org=""> >> > https://stat.ethz.ch/mailman/listinfo/bioconductor >> > Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > > > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 [[alternative HTML version deleted]]
ADD COMMENT

Login before adding your answer.

Traffic: 605 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6