Error in calculating P-values with Genefilter function


Guest User ★ 13k


To whom it may concern,

I am having trouble with the genefilter function in R. I am attempting to extract genes from 7 arrays at a p-value threshold of 0.01 using the following code:

    Func7P0.01<-filterfun(Anova(class7,p=0.01))
    Func7P0.01
    Anova7_P0.01<-genefilter(SCDexprs7,Func7P0.01)
    Anova7_P0.01

Creating Func7P0.01 works fine, but when I run genefilter using my data matrix and Func7P0.01 I get the following error:

    > Anova7_P0.01<-genefilter(SCDexprs7,Func7P0.01)
    Error in if (fstat < p) return(TRUE) :
      missing value where TRUE/FALSE needed

and when I run traceback(), I get:

    > traceback()
    4: fun(x)
    3: FUN(newX[, i], ...)
    2: apply(expr, 1, flist)
    1: genefilter(SCDexprs7, Func7P0.01)

I'm not entirely sure what is going on, but when I extract genes from the same 7 arrays plus another array (8 arrays total) using the same code structure (below), it works fine.

    Func8P0.01<-filterfun(Anova(class8,p=0.01))
    Func8P0.01
    Anova8_P0.01<-genefilter(SCDexprs8,Func8P0.01)
    Anova8_P0.01

Any help with this matter would be greatly appreciated, as I am not sure what else to try.

Thanks in advance!
Brad Cattrysse

-- output of sessionInfo():

    > sessionInfo()
    R version 3.0.0 (2013-04-03)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)

    locale:
    [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

    attached base packages:
    [1] parallel stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
     [1] pd.mogene.1.1.st.v1_3.8.0 RSQLite_0.11.3
     [3] DBI_0.2-6 ggplot2_0.9.3.1
     [5] e1071_1.6-1 class_7.3-7
     [7] pvac_1.8.0 pgmm_1.0
     [9] mclust_4.1 cluster_1.14.4
    [11] genefilter_1.42.0 oligoData_1.8.0
    [13] oligo_1.24.0 Biobase_2.20.0
    [15] oligoClasses_1.22.0 BiocGenerics_0.6.0

    loaded via a namespace (and not attached):
     [1] affxparser_1.32.0 affy_1.38.1 affyio_1.28.0
     [4] annotate_1.38.0 AnnotationDbi_1.22.5 BiocInstaller_1.10.1
     [7] Biostrings_2.28.0 bit_1.1-10 codetools_0.2-8
    [10] colorspace_1.2-2 dichromat_2.0-0 digest_0.6.3
    [13] ff_2.2-11 foreach_1.4.0 GenomicRanges_1.12.2
    [16] grid_3.0.0 gtable_0.1.2 IRanges_1.18.0
    [19] iterators_1.0.6 labeling_0.1 MASS_7.3-26
    [22] munsell_0.4 plyr_1.8 preprocessCore_1.22.0
    [25] proto_0.3-10 RColorBrewer_1.0-5 reshape2_1.2.2
    [28] scales_0.2.3 splines_3.0.0 stats4_3.0.0
    [31] stringr_0.6.2 survival_2.37-4 tools_3.0.0
    [34] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0

-- Sent via the guest posting facility at bioconductor.org.


ADD COMMENT • link updated 12.7 years ago by James W. MacDonald 68k • written 12.7 years ago by Guest User ★ 13k


Martin Morgan 25k


Not sure whether you saw this part of the response to your earlier email.

This error is generated when the test in the 'if' statement is NA:

    > if (NA) TRUE else FALSE
    Error in if (NA) TRUE else FALSE :
      missing value where TRUE/FALSE needed

It looks like the line of code causing the problem is from 'Anova':

    > Anova(class7,p=0.01)
    function (x)
    {
        ...
        m1 <- lm(x ~ cov)
        m2 <- lm(x ~ 1)
        av <- anova(m2, m1)
        fstat <- av[["Pr(>F)"]][2]
        if (fstat < p)
        ...
    }

You could gain more insight by debugging this function:

    afilt = Anova(class7, p=0.01)
    debug(afilt)
    Func7P0.01<-filterfun(afilt)
    ...

This should break into the browser (see ?browser) and allow you to step through the function, explore variables, and figure out what is going on. We'd need a short reproducible example to provide more insight...

Martin

On 06/03/2013 11:12 AM, Maintainer wrote:
> [...]

--
Computational Biology / Fred Hutchinson Cancer Research Center

ADD COMMENT • link 12.7 years ago Martin Morgan 25k
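The failure mode described in this reply can be reproduced in base R without genefilter. The sketch below is only an illustration: `passes_threshold` is a hypothetical helper (it is not part of genefilter) showing one way a comparison against a missing p-value could be made to return FALSE instead of stopping with an error.

```r
# Reproduce the failure: `if` cannot branch on a missing value.
msg <- tryCatch(
  if (NA) TRUE else FALSE,
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical helper, not part of genefilter.
# A comparison with NA or NaN yields NA; isTRUE() maps that to FALSE.
passes_threshold <- function(pval, p = 0.01) {
  isTRUE(pval < p)
}

print(passes_threshold(0.001))     # small p-value passes
print(passes_threshold(0.5))       # large p-value does not
print(passes_threshold(NA_real_))  # missing p-value is treated as "does not pass"
print(passes_threshold(NaN))       # so is an undefined one
```

Whether silently dropping such probes is appropriate depends on the analysis; finding out why the p-value is missing, as suggested above, is usually the better first step.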


James W. MacDonald 68k


Hi Brad,

On 6/3/2013 2:12 PM, Brad Cattrysse [guest] wrote:
> [...]
> Im not entirely sure what is going on, but when I extract genes from the same 7 arrays, plus another array (8 arrays total) using the same code structure (below) it works fine.

My best guess would be that you have some missing data for a particular gene, and when you only have seven arrays you get to a point where you don't have enough data of one type to fit a linear model, so the code here

    m1 <- lm(x ~ cov)
    m2 <- lm(x ~ 1)
    av <- anova(m2, m1)

from Anova() breaks.

Try doing

    options(error = recover)

and then run genefilter. You will error out at the point where things are breaking, and can look at the variables being analyzed at that point to see what the problem is.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences

ADD COMMENT • link 12.7 years ago James W. MacDonald 68k
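The missing-data hypothesis above can be checked directly before running any filter. The following is a rough sketch on simulated data: the matrix `expr` and the grouping `grp` are made-up stand-ins (the real `SCDexprs7` and `class7` are not shown in the thread), and the cutoff of two values per group is an arbitrary choice for illustration.

```r
set.seed(1)
# Simulated stand-in for an expression matrix: 5 probes x 7 arrays.
expr <- matrix(rnorm(35), nrow = 5)
# Hypothetical grouping of the 7 arrays.
grp <- factor(c("a", "a", "a", "a", "b", "b", "b"))

expr[2, 5:7] <- NA   # probe 2 has no data at all in group "b"
expr[4, 1] <- NA     # probe 4 is missing a single value

# Probes (rows) that contain any missing value.
rows_with_na <- which(rowSums(is.na(expr)) > 0)

# Non-missing observations per probe (rows) and group (columns).
n_per_group <- t(apply(expr, 1, function(x) tapply(!is.na(x), grp, sum)))

# Probes where some group has fewer than two usable values.
too_few <- which(apply(n_per_group < 2, 1, any))

print(rows_with_na)
print(n_per_group)
print(too_few)
```

A probe flagged in `too_few` is a candidate for the kind of failure described above, while a probe with only an isolated missing value may still fit without trouble.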


James W. MacDonald 68k


Hi Brad,

Please don't take things off-list (e.g., in future, use reply-all). We like to think of the list archives as a searchable repository of knowledge, and if we go off-list, that aspect is lost.

On 6/4/2013 11:53 AM, Bradley Cattrysse wrote:
> Hi Jim,
>
> Thank you for the help. When I run options(error=recover) it does show where the error is occurring, specifying that it is happening in fun(x) like when I use the traceback() function. Im not sure how to diagnose from there. We are analyzing an 8 array set, but we have deemed one array may be problematic. It works perfectly on the 8 array set, but when I drop one array I get the error. If you have any additional ideas that may help in diagnosing this problem the help would be greatly appreciated!
>
> Thanks again,
> Brad

Ideally what will happen is that when you error out, you will be able to figure out what the problem is by looking at the various frames that are available to you. As an example (which indicates that my original idea is not correct):

    dat <- matrix(rnorm(10000), ncol=10)
    dat[432,1:5] <- NA ## make sure it will break
    library(genefilter)
    fact <- factor(rep(1:2, each=5))
    f <- filterfun(Anova(fact, p=0.01))
    options(error=recover)
    genefilter(dat, f)

    Enter a frame number, or 0 to exit

    1: genefilter(dat, f)
    2: apply(expr, 1, flist)
    3: FUN(newX[, i], ...)
    4: fun(x)
    5: lm(x ~ cov)
    6: model.matrix(mt, mf, contrasts)
    7: model.matrix.default(mt, mf, contrasts)
    8: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])

    Selection: 3        <-- I chose to enter frame #3
    Called from: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
    Browse[1]> ls()     <-- What's in here?
    [1] "fun" "x"
    Browse[1]> x        <-- What is x?
     [1]         NA         NA         NA         NA         NA  0.2737152
     [7]  0.4907177 -0.1716024  0.2109492  1.0631105

You can then hit enter and look at other frames. This isn't an exact science. For example, frame 2 is hard to figure out:

    Enter a frame number, or 0 to exit

    1: genefilter(dat, f)
    2: apply(expr, 1, flist)
    3: FUN(newX[, i], ...)
    4: fun(x)
    5: lm(x ~ cov)
    6: model.matrix(mt, mf, contrasts)
    7: model.matrix.default(mt, mf, contrasts)
    8: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])

    Selection: 2
    Called from: model.matrix.default(mt, mf, contrasts)
    Browse[1]> ls()
     [1] "ans"     "d"       "d2"      "d.ans"   "d.call"  "dl"      "dn"
     [8] "dn.ans"  "dn.call" "ds"      "FUN"     "i"       "MARGIN"  "newX"
    [15] "s.ans"   "s.call"  "tmp"     "X"

That's a lot of stuff, and fairly cryptic. But we can get some info here:

    Browse[1]> i
    [1] 432

So we know this is row 432, where we put the NAs. You just need to poke around in the various frames to try to figure out what is wrong with your data, and why you get the errors. It is always safest to do something like

    Browse[1]> class(X)
    [1] "matrix"
    Browse[1]> dim(X)
    [1] 1000 10

rather than just typing X to see what it is, as sometimes these things are really big and you might get stuck with lots of data being output to your screen.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences

ADD COMMENT • link 12.7 years ago James W. MacDonald 68k
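The interactive walk through the frames can be complemented by a non-interactive scan that reports which rows fail. This is only a sketch in base R: `row_status` is a hypothetical helper, the `lm()`/`anova()` lines mirror the excerpt of `Anova()` quoted earlier in the thread rather than calling genefilter itself, and the data are simulated.

```r
set.seed(42)
dat <- matrix(rnorm(200), ncol = 10)   # 20 simulated probes x 10 arrays
fact <- factor(rep(1:2, each = 5))
dat[7, 1:5] <- NA                      # one whole group missing: lm() stops with an error
dat[13, ] <- 0                         # constant row: no variance left to test

# Hypothetical helper: returns the p-value, or the error message as text.
row_status <- function(x, cov) {
  tryCatch({
    m1 <- lm(x ~ cov)
    m2 <- lm(x ~ 1)
    anova(m2, m1)[["Pr(>F)"]][2]
  }, error = function(e) conditionMessage(e))
}

status <- lapply(seq_len(nrow(dat)), function(i) row_status(dat[i, ], fact))

# Rows where model fitting stopped with an error.
errored <- which(vapply(status, is.character, logical(1)))
# Rows where fitting succeeded but the p-value is missing or undefined.
na_pval <- which(vapply(status, function(s) is.numeric(s) && is.na(s), logical(1)))

print(errored)
print(na_pval)
```

Rows listed in `na_pval` are the ones that would trip a test of the form `if (fstat < p)`; rows in `errored` fail earlier, inside `lm()`, as in the simulated example above.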


Hi Jim,

Thanks for the additional help in trying to solve this problem. I used the options(error=recover) command and poked around like you said, and found that probe 56 was giving the function a problem (like the NA in row 432 in yours). I removed that row from the data set and tried to re-run the p-value calculation to see if that would solve the problem. Although I think it solved that problem, I am now experiencing a different error with the function. There is a problem in the apply(expr, 1, flist) frame of genefilter:

    > Anova7_P0.01<-genefilter(check,Func7P0.01)
    Error in apply(expr, 1, flist) : dim(X) must have a positive length

    Enter a frame number, or 0 to exit

    1: genefilter(check, Func7P0.01)
    2: apply(expr, 1, flist)

    Selection: 2
    Called from: genefilter(check, Func7P0.01)
    Browse[1]> ls()
    [1] "dl" "FUN" "MARGIN" "X"
    Browse[1]> X
    [1] 35555 7
    Browse[1]> dim(X)
    NULL

It says that dim(X) must have a positive length. When I browse X it says it has 35555 rows and 7 columns, which is correct for the data set. But then when I browse the dimensions of X it says NULL. I'm not sure why this is. Do you have any idea what I should do to troubleshoot this?

Thanks again, I really appreciate the help troubleshooting!

Brad

ADD REPLY • link 12.7 years ago Bradley Cattrysse ▴ 30


Hi Brad, On 6/10/2013 10:34 AM, Bradley Cattrysse wrote: > Hi Jim, > Thanks for the additional help in trying to solve this problem. I used the option(error=recover) command and poked around like you said and found that probe 56 was giving the function a problem (like the NA in row 432 in yours). I removed that row from the data set and tried to re-run the p-value calculation to see if that would solve the problem. Although I think it solved that problem, I am now experiencing a different error with the function. There is a problem in the apply(expr, 1, flist) frame of genefilter: > >> Anova7_P0.01<-genefilter(check,Func7P0.01) > Error in apply(expr, 1, flist) : dim(X) must have a positive length > > Enter a frame number, or 0 to exit > > 1: genefilter(check, Func7P0.01) > 2: apply(expr, 1, flist) > > Selection: 2 > Called from: genefilter(check, Func7P0.01) > Browse[1]> ls() > [1] "dl" "FUN" "MARGIN" "X" > Browse[1]> X > [1] 35555 7 > Browse[1]> dim(X) > NULL It doesn't say that the dimensions of X are 35555 x 7. It says that X is a vector with two numbers in it, (35555 and 7) and that the dimensions of X are NULL, which stands to reason as it is a vector, which has no dimensional attributes. You might try poking around in frame 1. Usually I get better results when I look one frame higher than I think I should. Best, Jim > > It says that dim(X) must have a positive length. When I browse X it says it has 35555 rows and 7 columns, which is correct for the data set. But then when I browse the dimensions of X it says NULL. Im not sure why this is? Do you have any idea what I should do to problem shoot this? > > Thanks again I really appreciate the help troubleshooting! > Brad > > > > ----- Original Message ----- > From: "James W. 
MacDonald"<jmacdon at="" uw.edu=""> > To: "Bradley Cattrysse"<bcattrys at="" uoguelph.ca=""> > Cc: Bioconductor at r-project.org > Sent: Tuesday, June 4, 2013 12:21:35 PM > Subject: Re: [BioC] Error in calculating P-values with Genefilter function > > Hi Brad, > > Please don't take things off-list (e.g., in future, use reply-all). We > like to think of the list archives as a searchable repository of > knowledge, and if we go off-list, that aspect is lost. > > On 6/4/2013 11:53 AM, Bradley Cattrysse wrote: >> Hi Jim, >> >> Thank you for the help. When I run the option(error=recover) it does show where the error is occurring, specifying that it is happening in fun(x) like when I use the traceback() function. Im not sure how to diagnose from there. We are analyzing an 8 array set, but we have deemed one array may be problematic. It works perfectly on the 8 array set, but when I drop one array I get the error. If you have any additional ideas that may help in diagnosing this problem the help would be greatly appreciated! > Ideally what will happen is that when you error out, you will be able to > figure out what the problem is by looking at the various frames that are > available to you. As an example (which indicates that my original idea > is not correct): > > dat<- matrix(rnorm(10000), ncol=10) > dat[432,1:5]<- NA ## make sure it will break > library(genefilter) > fact<- factor(rep(1:2, each=5)) > f<- filterfun(Anova(fact, p=0.01)) > options(error=recover) > genefilter(dat, f) > > Enter a frame number, or 0 to exit > > 1: genefilter(dat, f) > 2: apply(expr, 1, flist) > 3: FUN(newX[, i], ...) 
> 4: fun(x)
> 5: lm(x ~ cov)
> 6: model.matrix(mt, mf, contrasts)
> 7: model.matrix.default(mt, mf, contrasts)
> 8: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
>
> Selection: 3    <-- I chose to enter frame #3
> Called from: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
> Browse[1]> ls()    <-- What's in here?
> [1] "fun" "x"
> Browse[1]> x    <-- What is x?
> [1] NA NA NA NA NA 0.2737152
> [7] 0.4907177 -0.1716024 0.2109492 1.0631105
>
> You can then hit enter and look at other frames. This isn't an exact
> science. For example, frame 2 is hard to figure out:
>
> Enter a frame number, or 0 to exit
>
> 1: genefilter(dat, f)
> 2: apply(expr, 1, flist)
> 3: FUN(newX[, i], ...)
> 4: fun(x)
> 5: lm(x ~ cov)
> 6: model.matrix(mt, mf, contrasts)
> 7: model.matrix.default(mt, mf, contrasts)
> 8: `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
>
> Selection: 2
> Called from: model.matrix.default(mt, mf, contrasts)
> Browse[1]> ls()
> [1] "ans" "d" "d2" "d.ans" "d.call" "dl" "dn"
> [8] "dn.ans" "dn.call" "ds" "FUN" "i" "MARGIN" "newX"
> [15] "s.ans" "s.call" "tmp" "X"
>
> That's a lot of stuff, and fairly cryptic. But we can get some info here:
>
> Browse[1]> i
> [1] 432
>
> So we know this is row 432, where we put the NAs. You just need to poke
> around in the various frames to try to figure out what is wrong with
> your data, and why you get the errors. It is always safest to do
> something like
>
> Browse[1]> class(X)
> [1] "matrix"
> Browse[1]> dim(X)
> [1] 1000 10
>
> rather than just hitting X to see what it is, as sometimes these things
> are really big and you might get stuck with lots of data being output to
> your screen.
>
> Best,
>
> Jim
>
>> Thanks again,
>> Brad
>>
>> ----- Original Message -----
>> From: "James W. MacDonald" <jmacdon at uw.edu>
>> To: "Brad Cattrysse [guest]" <guest at bioconductor.org>
>> Cc: bioconductor at r-project.org, bcattrys at uoguelph.ca, "genefilter Maintainer" <maintainer at bioconductor.org>
>> Sent: Monday, June 3, 2013 2:27:19 PM
>> Subject: Re: [BioC] Error in calculating P-values with Genefilter function
>>
>> Hi Brad,
>>
>> My best guess would be that you have some missing data for a particular
>> gene, and when you only have seven arrays you get to a point where you
>> don't have enough data of one type to fit a linear model, so the code here
>>
>> m1 <- lm(x ~ cov)
>> m2 <- lm(x ~ 1)
>> av <- anova(m2, m1)
>>
>> from Anova() breaks.
>>
>> Try doing
>>
>> options(error = recover)
>>
>> and then run genefilter. You will error out at the point where things
>> are breaking, and can look at the variables being analyzed at that point
>> to see what the problem is.
>>
>> Best,
>>
>> Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
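The error in the quoted example (the one ending in `contrasts<-`, which is not necessarily the original `if (fstat < p)` error) can be reproduced without genefilter. The following is a minimal base-R sketch under the same assumptions as the example: a two-level factor with five arrays per group. The helper `rows_with_empty_group` is a hypothetical name introduced here for illustration and is not part of genefilter. When every value for one group is NA, `lm()` drops those observations together with the now-unused factor level, which leaves a one-level factor that cannot be given contrasts.

```r
set.seed(1)
dat <- matrix(rnorm(10000), ncol = 10)
dat[432, 1:5] <- NA                      # same deliberate breakage as in the example
fact <- factor(rep(1:2, each = 5))

# Fitting the problem row directly triggers the same kind of failure
# that occurs inside the filter function
msg <- tryCatch({ lm(dat[432, ] ~ fact); "no error" },
                error = function(e) conditionMessage(e))
msg

# Hypothetical helper: rows in which at least one group has no observed values
rows_with_empty_group <- function(m, groups) {
  observed <- !is.na(m)
  # per-row count of non-missing values within each group (one column per level)
  counts <- sapply(levels(groups), function(g)
    rowSums(observed[, groups == g, drop = FALSE]))
  which(apply(counts, 1, min) == 0)
}

bad <- rows_with_empty_group(dat, fact)
bad
```

Rows flagged this way could be inspected before calling `genefilter()`; whether dropping them is appropriate depends on the analysis.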

12.7 years ago James W. MacDonald 68k

Hi Jim,

I see what you mean, I was thinking it was giving me the number of observations in X. I will poke around some more, thanks again for the help!

Brad

----- Original Message -----
From: "James W. MacDonald" <jmacdon@uw.edu>
To: "Bradley Cattrysse" <bcattrys at uoguelph.ca>
Cc: Bioconductor at r-project.org
Sent: Tuesday, June 11, 2013 10:13:37 AM
Subject: Re: [BioC] Error in calculating P-values with Genefilter function

Hi Brad,

On 6/10/2013 10:34 AM, Bradley Cattrysse wrote:
> Hi Jim,
> Thanks for the additional help in trying to solve this problem. I used the options(error=recover) command and poked around like you said and found that probe 56 was giving the function a problem (like the NA in row 432 in yours). I removed that row from the data set and tried to re-run the p-value calculation to see if that would solve the problem. Although I think it solved that problem, I am now experiencing a different error with the function. There is a problem in the apply(expr, 1, flist) frame of genefilter:
>
> > Anova7_P0.01<-genefilter(check,Func7P0.01)
> Error in apply(expr, 1, flist) : dim(X) must have a positive length
>
> Enter a frame number, or 0 to exit
>
> 1: genefilter(check, Func7P0.01)
> 2: apply(expr, 1, flist)
>
> Selection: 2
> Called from: genefilter(check, Func7P0.01)
> Browse[1]> ls()
> [1] "dl" "FUN" "MARGIN" "X"
> Browse[1]> X
> [1] 35555 7
> Browse[1]> dim(X)
> NULL

It doesn't say that the dimensions of X are 35555 x 7. It says that X is a vector with two numbers in it (35555 and 7), and that the dimensions of X are NULL, which stands to reason as it is a vector, which has no dimensional attributes.

You might try poking around in frame 1. Usually I get better results when I look one frame higher than I think I should.

Best,

Jim

> It says that dim(X) must have a positive length. When I browse X it says it has 35555 rows and 7 columns, which is correct for the data set. But then when I browse the dimensions of X it says NULL. I'm not sure why this is? Do you have any idea what I should do to troubleshoot this?
>
> Thanks again I really appreciate the help troubleshooting!
> Brad
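The distinction at issue here can be checked at the prompt. A short sketch with made-up object names (`m`, `v`, `one_row`, `kept`): a length-two numeric vector prints just like the output of `dim()` on a 35555 x 7 matrix, but it carries no dim attribute, and `apply()` rejects it with the same message reported in the thread.

```r
m <- matrix(0, nrow = 35555, ncol = 7)   # a real 35555 x 7 matrix
v <- dim(m)                              # just the two numbers 35555 and 7

is.matrix(m)   # TRUE
dim(v)         # NULL: a plain vector has no dim attribute
length(v)      # 2

# apply() needs a dim attribute, so the vector triggers the error from the thread
err <- tryCatch(apply(v, 1, sum), error = function(e) conditionMessage(e))
err

# One common way to lose dimensions by accident: selecting a single row
one_row <- m[56, ]                 # dimensions dropped, now a vector of length 7
kept    <- m[56, , drop = FALSE]   # still a 1 x 7 matrix
dim(kept)
```

How `check` ended up as a two-element vector is not stated in the thread; the single-row case is only one possibility, shown to illustrate how dimensions can disappear silently.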

12.7 years ago Bradley Cattrysse ▴ 30

And this brings me back to the admonition that you should always do something like

class(X)

or

dim(X)

first. Depending on how you are running R, if X really were a 35555 x 7 matrix or data.frame and you just typed X at the prompt, you will get the whole thing output to your screen (or up to the row limit set in options). I run R under emacs, and sometimes it isn't possible to get R to stop that nonsense without doing a kill command at a terminal prompt.

Best,

Jim
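A small sketch of that habit, using only base R functions. The object `X` below is simulated stand-in data, not the poster's matrix. Each call summarises the object without printing all of it.

```r
X <- matrix(rnorm(35555 * 7), ncol = 7)   # simulated stand-in for a large expression matrix

class(X)            # what kind of object is it?
dim(X)              # number of rows and columns
str(X)              # compact structure summary
head(X, 3)          # first three rows only
sum(is.na(X))       # number of missing values, without printing any data

# options(max.print = ...) caps how many entries print() will emit;
# lowering it limits the damage of typing a large object's name by accident
old <- options(max.print = 200)
getOption("max.print")
options(old)        # restore the previous setting
```

Inside the `recover()` browser the same calls apply to whatever objects `ls()` lists in the selected frame.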

12.7 years ago James W. MacDonald 68k

I see, that makes sense. I was misunderstanding what X was giving me, and thus approaching the problem from the wrong angle. Your help is much appreciated, and hopefully this will point me in the right direction in solving the problem.

Thanks again,
Brad

12.7 years ago Bradley Cattrysse ▴ 30

More baR tRivia!

On Tue, Jun 11, 2013 at 7:27 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
> And this brings me back to the admonition that you should always do
> something like
>
> class(X)
>
> or
>
> dim(X)
>
> first. Depending on how you are running R, if X really were a 35555 x 7
> matrix or data.frame and you just typed X at the prompt, you will get the
> whole thing output to your screen (or up to the row limit set in options). I
> run R under emacs, and sometimes it isn't possible to get R to stop that
> nonsense without doing a kill command at a terminal prompt.

I've co-opted the code from the data.table package to change how data.frames are printed to stdout for this very reason. I've accidentally dumped huge data.frames to the screen often enough to make me wonder if programming with punch cards wouldn't be a better alternative than staring at emacs trying to plow through the output while I furiously tap ctrl-g or ctrl-c or something.

Anyway, the end result is to have R print out data.frames in a very IRanges/DataFrame like way, actually. Monster data.frames now don't bother me at all:

R> df <- data.frame(name=sample(letters, 10000, replace=TRUE),
                    score=rnorm(10000))
R> df ## "NOO!", you say ..
      name      score
1    |   b  1.1804078
2    |   j  0.8143630
3    |   t -0.1430033
4    |   n -1.3588291
5    |   o  2.7989686
---
9996 |   h -0.1317207
9997 |   c  0.1645823
9998 |   w  0.5061355
9999 |   a -1.6761684
10000|   c  0.3653244

Below is the code I have pasted into my ~/.Rprofile to make this happen. I'm sure there's a better place for it, but this works for 99% of the time that I shoot myself in the foot (except for when I'm in the `browser()` (debugger), actually -- if someone wants to offer a fix for that, I'd be grateful ;-)

Without further ado:

## -----------------------------------------------------------------------------
## A saner default to print data.frames from data.table
## #############################################################################

## Change print data.frame to be data.table like
format.data.frame <- function (x, ..., justify = "none") {
  ## Summarise one list-column cell: first few values, or the class name
  format.item <- function(x) {
    if (is.atomic(x))
      paste(c(head(x,6), if (length(x)>6) ""), collapse=",")
    else
      paste("<", class(x)[1L], ">", sep="")
  }
  ## Format every column and bind the results into a character matrix
  do.call("cbind", lapply(x, function(col, ...) {
    if (is.list(col)) col <- sapply(col, format.item)
    format(col, justify=justify, ...)
  }))
}

print.data.frame <- function (x, nrows=100L, digits=NULL, ...) {
  ## Empty data.frames get a one-line summary instead of a table
  if (nrow(x) == 0L) {
    if (length(x)==0L)
      cat("NULL data.frame\n")
    else
      cat("Empty data.frame (0 rows) of ", length(x), " col",
          if (length(x)>1L) "s", ": ",
          paste(head(names(x),6), collapse=","),
          if (ncol(x)>6) "...", "\n", sep="")
    return()
  }
  printdots <- FALSE
  n <- 5
  if (nrow(x)>nrows) {
    if (missing(nrows)) {
      ## Too many rows and no explicit limit: show the first and last n
      ##msg<-paste("First",nrows,"rows of",nrow(x),"printed.")
      toprint <- rbind(head(x,n), tail(x,n))
      rn <- c(seq_len(n), seq.int(to=nrow(x), length.out=n))
      printdots <- TRUE
    } else {
      ## Explicit limit: show the first nrows rows
      toprint <- head(x,nrows)
      rn <- seq_len(nrows)
    }
  } else {
    toprint <- x
    rn <- seq_len(nrow(x))
  }
  ## Replace idx with rownames
  rn <- rownames(x)[rn]
  toprint <- format.data.frame(toprint, digits=digits, na.encode = FALSE)
  rownames(toprint) <- paste(format(rn,right=TRUE), "|", sep="")
  if (printdots) {
    ## Insert a "---" separator row between the head and the tail
    toprint <- rbind(head(toprint,n), "---"="", tail(toprint,n))
    rownames(toprint) <- format(rownames(toprint), justify="right")
    print(toprint, right=TRUE, quote=FALSE)
    return(invisible())
  }
  if (nrow(toprint)>20L) {
    ## repeat colnames at the bottom if over 20 rows so you don't have to
    ## scroll up to see them
    toprint <- rbind(toprint, matrix(names(x),nrow=1))
  }
  print(toprint, right=TRUE, quote=FALSE)
  invisible()
}
############### End code ################

-steve

--
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech
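For anyone who would rather not mask `print.data.frame` in their ~/.Rprofile, a lighter-weight option is sketched below. The helper `peek()` is hypothetical (it is not from data.table or from the code above). It applies Jim's "look at class() and dim() first" advice and then returns only the first and last few rows.

```r
# Hypothetical helper: return the first and last n rows of a data.frame or
# matrix, so the result can be printed without flooding the console.
peek <- function(x, n = 5L) {
  stopifnot(is.data.frame(x) || is.matrix(x))
  if (nrow(x) <= 2L * n) return(x)   # small enough to show in full
  rbind(head(x, n), tail(x, n))
}

set.seed(1)  # only to make the toy data reproducible
df <- data.frame(name = sample(letters, 10000, replace = TRUE),
                 score = rnorm(10000))

class(df)    # check what the object is before printing it
dim(df)      # and how big it is
small <- peek(df)
print(small)
```

Unlike the override above, this leaves base printing untouched, so it only helps when you remember to call it.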

12.7 years ago Steve Lianoglou ★ 13k
