Odd isolated problem with GOstats
1
0
Entering edit mode
Mark W Kimpel ▴ 830
@mark-w-kimpel-2027
Last seen 9.6 years ago
I have used a custom wrapper function to invoke GOstats for almost a year without problem. Now, with just one dataset, I am getting an error. Not sure if this is a bug or something very odd about just my one dataset. The GO:ID that yields the error is a valid ID, I checked. I am using the devel version and rechecked after just updating my packages. Any ideas as to what the problem might be? Loading required package: rat2302 Error in .checkKeys(value, Lkeys(x), x at ifnotfound) : invalid key "GO:0016089" Enter a frame number, or 0 to exit 1: limma.contrast.output.func(AOP, fit2, rslt, contrast.num = 1, custom.contra 2: GO.anal.func(input.df = t.tab.annot.obj$sig.named.genes, annotationPckg = a 3: GOHyperMaxCats.func(optimizedParam) 4: data.frame(summary(test.obj)) 5: summary(test.obj) 6: summary(test.obj) 7: .local(object, ...) 8: sapply(mget(goIds, GOenv("TERM")), Term) 9: lapply(X, FUN, ...) 10: is.vector(X) 11: mget(goIds, GOenv("TERM")) 12: mget(goIds, GOenv("TERM")) 13: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", "GO:0050877", "GO:0 14: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", "GO:0050877", "GO:0 15: switch(as.character(direction(x)), "1" = `Lkeys<-`(x, value), "-1" = `Rkeys 16: `Lkeys<-`(x, value) 17: `Lkeys<-`(x, value) 18: .checkKeys(value, Lkeys(x), x at ifnotfound) > sessionInfo() R version 2.7.0 Under development (unstable) (2007-12-20 r43739) x86_64-unknown-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US .UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US. UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8 ;LC_IDENTIFICATION=C attached base packages: [1] splines tools stats graphics grDevices utils datasets [8] methods base other attached packages: [1] rat2302_2.0.1 affycoretools_1.11.2 annaffy_1.11.1 [4] KEGG_2.0.1 GO_2.0.1 gcrma_2.11.1 [7] matchprobes_1.11.0 biomaRt_1.13.6 RCurl_0.8-3 [10] GOstats_2.5.0 Category_2.5.0 genefilter_1.17.7 [13] survival_2.34 RBGL_1.15.7 annotate_1.17.3 [16] xtable_1.5-2 GO.db_2.1.1 AnnotationDbi_1.1.8 [19] RSQLite_0.6-4 DBI_0.2-4 graph_1.17.14 [22] limma_2.13.3 affy_1.17.3 preprocessCore_1.1.5 [25] affyio_1.7.9 Biobase_1.17.8 loaded via a namespace (and not attached): [1] cluster_1.11.9 XML_1.93-2 -- Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry Indiana University School of Medicine 15032 Hunter Court, Westfield, IN 46074 (317) 490-5129 Work, & Mobile & VoiceMail (317) 204-4202 Home (no voice mail please) mwkimpel<at>gmail<dot>com
GO limma GOstats GO limma GOstats • 1.0k views
ADD COMMENT
0
Entering edit mode
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Mark W Kimpel wrote: > I have used a custom wrapper function to invoke GOstats for almost a > year without problem. Now, with just one dataset, I am getting an error. > Not sure if this is a bug or something very odd about just my one > dataset. The GO:ID that yields the error is a valid ID, I checked. I am > using the devel version and rechecked after just updating my packages. > > Any ideas as to what the problem might be? > > Loading required package: rat2302 > Error in .checkKeys(value, Lkeys(x), x at ifnotfound) : > invalid key "GO:0016089" > > Enter a frame number, or 0 to exit > > 1: limma.contrast.output.func(AOP, fit2, rslt, contrast.num = 1, > custom.contra > 2: GO.anal.func(input.df = t.tab.annot.obj$sig.named.genes, > annotationPckg = a > 3: GOHyperMaxCats.func(optimizedParam) > 4: data.frame(summary(test.obj)) > 5: summary(test.obj) > 6: summary(test.obj) > 7: .local(object, ...) > 8: sapply(mget(goIds, GOenv("TERM")), Term) > 9: lapply(X, FUN, ...) > 10: is.vector(X) > 11: mget(goIds, GOenv("TERM")) > 12: mget(goIds, GOenv("TERM")) > 13: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", > "GO:0050877", "GO:0 > 14: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", > "GO:0050877", "GO:0 > 15: switch(as.character(direction(x)), "1" = `Lkeys<-`(x, value), "-1" = > `Rkeys > 16: `Lkeys<-`(x, value) > 17: `Lkeys<-`(x, value) > 18: .checkKeys(value, Lkeys(x), x at ifnotfound) > > > > sessionInfo() > R version 2.7.0 Under development (unstable) (2007-12-20 r43739) > x86_64-unknown-linux-gnu > > locale: > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_ US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_U S.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF -8;LC_IDENTIFICATION=C > > attached base packages: > [1] splines tools stats graphics grDevices utils datasets > [8] methods base > > other attached packages: > [1] rat2302_2.0.1 affycoretools_1.11.2 annaffy_1.11.1 > [4] KEGG_2.0.1 GO_2.0.1 gcrma_2.11.1 > [7] matchprobes_1.11.0 biomaRt_1.13.6 RCurl_0.8-3 > [10] GOstats_2.5.0 Category_2.5.0 genefilter_1.17.7 > [13] survival_2.34 RBGL_1.15.7 annotate_1.17.3 > [16] xtable_1.5-2 GO.db_2.1.1 AnnotationDbi_1.1.8 > [19] RSQLite_0.6-4 DBI_0.2-4 graph_1.17.14 > [22] limma_2.13.3 affy_1.17.3 preprocessCore_1.1.5 > [25] affyio_1.7.9 Biobase_1.17.8 > > loaded via a namespace (and not attached): > [1] cluster_1.11.9 XML_1.93-2 > >From looking at the latest package, it appears that this term was just not in the version of GO that was used to generate the latest GO and data packages (the data for this was last updated in the Fall). You can expect an updated version of GO will be available for all packages in the spring as part of our biannual update process. We update the packages in devel as we make bug fixes etc. But we freeze the data sources twice a year (and especially GO because everything depends on it) so that they are uniform across the datasets provided by ourselves and our contributors. I hope this helps, Marc
ADD COMMENT
0
Entering edit mode
Marc Carlson wrote: > Mark W Kimpel wrote: > >> I have used a custom wrapper function to invoke GOstats for almost a >> year without problem. Now, with just one dataset, I am getting an error. >> Not sure if this is a bug or something very odd about just my one >> dataset. The GO:ID that yields the error is a valid ID, I checked. I am >> using the devel version and rechecked after just updating my packages. >> >> Any ideas as to what the problem might be? >> >> Loading required package: rat2302 >> Error in .checkKeys(value, Lkeys(x), x at ifnotfound) : >> invalid key "GO:0016089" >> >> Enter a frame number, or 0 to exit >> >> 1: limma.contrast.output.func(AOP, fit2, rslt, contrast.num = 1, >> custom.contra >> 2: GO.anal.func(input.df = t.tab.annot.obj$sig.named.genes, >> annotationPckg = a >> 3: GOHyperMaxCats.func(optimizedParam) >> 4: data.frame(summary(test.obj)) >> 5: summary(test.obj) >> 6: summary(test.obj) >> 7: .local(object, ...) >> 8: sapply(mget(goIds, GOenv("TERM")), Term) >> 9: lapply(X, FUN, ...) >> 10: is.vector(X) >> 11: mget(goIds, GOenv("TERM")) >> 12: mget(goIds, GOenv("TERM")) >> 13: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", >> "GO:0050877", "GO:0 >> 14: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", >> "GO:0050877", "GO:0 >> 15: switch(as.character(direction(x)), "1" = `Lkeys<-`(x, value), "-1" = >> `Rkeys >> 16: `Lkeys<-`(x, value) >> 17: `Lkeys<-`(x, value) >> 18: .checkKeys(value, Lkeys(x), x at ifnotfound) >> >> >> > sessionInfo() >> R version 2.7.0 Under development (unstable) (2007-12-20 r43739) >> x86_64-unknown-linux-gnu >> >> locale: >> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en _US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_ US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UT F-8;LC_IDENTIFICATION=C >> >> attached base packages: >> [1] splines tools stats graphics grDevices utils datasets >> [8] methods base >> >> other attached packages: >> [1] rat2302_2.0.1 affycoretools_1.11.2 annaffy_1.11.1 >> [4] KEGG_2.0.1 GO_2.0.1 gcrma_2.11.1 >> [7] matchprobes_1.11.0 biomaRt_1.13.6 RCurl_0.8-3 >> [10] GOstats_2.5.0 Category_2.5.0 genefilter_1.17.7 >> [13] survival_2.34 RBGL_1.15.7 annotate_1.17.3 >> [16] xtable_1.5-2 GO.db_2.1.1 AnnotationDbi_1.1.8 >> [19] RSQLite_0.6-4 DBI_0.2-4 graph_1.17.14 >> [22] limma_2.13.3 affy_1.17.3 preprocessCore_1.1.5 >> [25] affyio_1.7.9 Biobase_1.17.8 >> >> loaded via a namespace (and not attached): >> [1] cluster_1.11.9 XML_1.93-2 >> >> If you could show me what it is that your script is doing (some source code) then I might be better able to help you with this. But as it stands right now I have very few specifics about what you are trying to do. I can see you posted the sessionInfo() which helps me know what versions of things you are using. But what is this script that you speak of? What does that code actually look like? I am afraid that I don't have a patch for you because the annotation package is not actually broken, and is only missing information because of what was available at GO at the precise moment when that package was made. I can't tell you why GO decided to remove that particular term and then later on decided to add it back in, but I looked at data that we gathered from them to build annotations at different time points and that is what has actually happened in this case. >From your previous posts, it looks like your trouble is caused because you are missing a key and then looking for that missing key in an environment. If that is what is happening, then I think you should code things in a way so that you can filter out keys that are not present. More specifically I think that you probably just want to do some minor cleanup before you call something like mget() to make sure that your keys are actually present. Here is a quick example of how this might look: library(GO.db) myKeys = c("FAKEID","GO:0000006","GO:0000003") myKeys[myKeys %in% keys(GOTERM)] Hope this helps, Marc
ADD REPLY
0
Entering edit mode
Marc, I have attempted to modify my code without success. The reason I was requesting a patch is because I believe the problem is originating within the summary function. Below is my code that generatesa GOHyperGParams object, applies hyperGTest, and then tries to summarize the result. I took a look at the test.object created by hyperGTest to see if it contained GO:ID's that I could remove or subset out as you suggest but it does not contain any. These seem to be fetched my mget called from within summary. I looked at the source code for summary and tried to modify it by having mget return NA ifnotfound (see further below) but this results in a problem with lapply when "standardGeneric for "Term" defined from package "AnnotationDbi"" is the FUN applied to NA. So I am stuck and would appreciate your help. Please see my comment in the R code for where I think a fix should be applied. I understand that this may only occur occasionally and only in BioC-devel, but it would seem useful to those of us who do test BioC- devel. Thanks, Mark ############################################################# #MY CODE params <- new("GOHyperGParams", geneIds = selectedEntrezIds, universeGeneIds = entrez.universe.vec, annotation = annotationPckg, ontology = "BP", pvalueCutoff = minP, conditional = TRUE, testDirection = testDirection) test.obj<-hyperGTest(params) #this is where the blow-up occurs output.table<-data.frame(summary(test.obj)) ###################################################################### # GOstats code from hyperGtable.R ################################################################## setMethod("summary", signature(object="GOHyperGResult"), function(object, pvalue=pvalueCutoff(object), categorySize=NULL, htmlLinks=FALSE) { AMIGO_URL <- "http://www.godatabase.org/cgi-bin/amigo/go.cgi?view=details&search_co nstraint=terms&depth=0&query=%s" df <- callNextMethod(object=object, pvalue=pvalue, categorySize=categorySize) if (nrow(df) == 0) { df$Term <- character(0) return(df) } #This is where it would be nice to filter out invalid goIds but I can't # figure out how to do it goIds <- df[[1]] goTerms <- sapply(mget(goIds, GOenv("TERM"), ifnotfound = NA), Term) if (htmlLinks) { goIdUrls <- sapply(goIds, function(x) sprintf(AMIGO_URL, x)) goTerms <- paste('', goTerms, '', sep="") } df$Term <- goTerms df }) Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry Indiana University School of Medicine 15032 Hunter Court, Westfield, IN 46074 (317) 490-5129 Work, & Mobile & VoiceMail (317) 204-4202 Home (no voice mail please) mwkimpel<at>gmail<dot>com ****************************************************************** Marc Carlson wrote: > Marc Carlson wrote: >> Mark W Kimpel wrote: >> >>> I have used a custom wrapper function to invoke GOstats for almost a >>> year without problem. Now, with just one dataset, I am getting an error. >>> Not sure if this is a bug or something very odd about just my one >>> dataset. The GO:ID that yields the error is a valid ID, I checked. I am >>> using the devel version and rechecked after just updating my packages. >>> >>> Any ideas as to what the problem might be? >>> >>> Loading required package: rat2302 >>> Error in .checkKeys(value, Lkeys(x), x at ifnotfound) : >>> invalid key "GO:0016089" >>> >>> Enter a frame number, or 0 to exit >>> >>> 1: limma.contrast.output.func(AOP, fit2, rslt, contrast.num = 1, >>> custom.contra >>> 2: GO.anal.func(input.df = t.tab.annot.obj$sig.named.genes, >>> annotationPckg = a >>> 3: GOHyperMaxCats.func(optimizedParam) >>> 4: data.frame(summary(test.obj)) >>> 5: summary(test.obj) >>> 6: summary(test.obj) >>> 7: .local(object, ...) >>> 8: sapply(mget(goIds, GOenv("TERM")), Term) >>> 9: lapply(X, FUN, ...) >>> 10: is.vector(X) >>> 11: mget(goIds, GOenv("TERM")) >>> 12: mget(goIds, GOenv("TERM")) >>> 13: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", >>> "GO:0050877", "GO:0 >>> 14: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214", >>> "GO:0050877", "GO:0 >>> 15: switch(as.character(direction(x)), "1" = `Lkeys<-`(x, value), "-1" = >>> `Rkeys >>> 16: `Lkeys<-`(x, value) >>> 17: `Lkeys<-`(x, value) >>> 18: .checkKeys(value, Lkeys(x), x at ifnotfound) >>> >>> >>> > sessionInfo() >>> R version 2.7.0 Under development (unstable) (2007-12-20 r43739) >>> x86_64-unknown-linux-gnu >>> >>> locale: >>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=e n_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en _US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.U TF-8;LC_IDENTIFICATION=C >>> >>> attached base packages: >>> [1] splines tools stats graphics grDevices utils datasets >>> [8] methods base >>> >>> other attached packages: >>> [1] rat2302_2.0.1 affycoretools_1.11.2 annaffy_1.11.1 >>> [4] KEGG_2.0.1 GO_2.0.1 gcrma_2.11.1 >>> [7] matchprobes_1.11.0 biomaRt_1.13.6 RCurl_0.8-3 >>> [10] GOstats_2.5.0 Category_2.5.0 genefilter_1.17.7 >>> [13] survival_2.34 RBGL_1.15.7 annotate_1.17.3 >>> [16] xtable_1.5-2 GO.db_2.1.1 AnnotationDbi_1.1.8 >>> [19] RSQLite_0.6-4 DBI_0.2-4 graph_1.17.14 >>> [22] limma_2.13.3 affy_1.17.3 preprocessCore_1.1.5 >>> [25] affyio_1.7.9 Biobase_1.17.8 >>> >>> loaded via a namespace (and not attached): >>> [1] cluster_1.11.9 XML_1.93-2 >>> >>> > > If you could show me what it is that your script is doing (some source > code) then I might be better able to help you with this. But as it > stands right now I have very few specifics about what you are trying to > do. I can see you posted the sessionInfo() which helps me know what > versions of things you are using. But what is this script that you > speak of? What does that code actually look like? I am afraid that I > don't have a patch for you because the annotation package is not > actually broken, and is only missing information because of what was > available at GO at the precise moment when that package was made. I > can't tell you why GO decided to remove that particular term and then > later on decided to add it back in, but I looked at data that we > gathered from them to build annotations at different time points and > that is what has actually happened in this case. > >>From your previous posts, it looks like your trouble is caused because > you are missing a key and then looking for that missing key in an > environment. If that is what is happening, then I think you should code > things in a way so that you can filter out keys that are not present. > > More specifically I think that you probably just want to do some minor > cleanup before you call something like mget() to make sure that your > keys are actually present. Here is a quick example of how this might look: > > library(GO.db) > > myKeys = c("FAKEID","GO:0000006","GO:0000003") > > myKeys[myKeys %in% keys(GOTERM)] > > > Hope this helps, > > > Marc > > >
ADD REPLY

Login before adding your answer.

Traffic: 704 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6