GOstats problem with output
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Germany
Hi,

I am trying to run a hyperGTest with GOstats on mouse Entrez Gene IDs. I imported the IDs from biomaRt:

    library(biomaRt)
    library(GOstats)

    entrez_data_1 <- getBM(attributes = c("mgi_id", "entrezgene"),
                           filters = "mgi_id",
                           values = as.character(data_1$MGI),
                           mart = mart)
    head(entrez_data_1)
    entrezID_Universe <- getBM(mart = mart,
                               attributes = c("mgi_id", "entrezgene"),
                               filters = "mgi_id",
                               values = as.character(MaxQuant18$MGI))
    entrezID_Universe
    paramsBP <- new("GOHyperGParams",
                    geneIds = as.character(entrez_data_1[,2]),
                    universeGeneIds = as.character(entrezID_Universe[,2]),
                    annotation = "org.Mm.eg.db",
                    ontology = "BP",
                    pvalueCutoff = 0.05,
                    conditional = FALSE,
                    testDirection = "over")

I then ran the hyperGTest command, which succeeded:

    > MmOverBP <- hyperGTest(paramsBP)
    > MmOverBP
    Gene to GO BP test for over-representation
    3146 GO BP ids tested (118 have p < 0.05)
    Selected gene set size: 1006
        Gene universe size: 2935
        Annotation package: org.Mm.eg

But then summary() fails:

    > summary(MmOverBP)
    Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
      value for "GO:2000021" not found

As far as I know, I have the latest version of both packages. I checked in AmiGO whether this GO ID exists, and it does:

    Accession: GO:2000021
    Ontology: Biological Process
    Synonyms (related): regulation of electrolyte homeostasis
                        regulation of negative regulation of crystal biosynthesis
                        regulation of negative regulation of crystal formation

Is there a way of annotating this specific item manually, so that I can see it? If not, is there a way of taking this GO ID out of the list of GO categories, so that I can see the results?

Thanks a lot,
Assa

    > sessionInfo()
    R version 2.12.2 (2011-02-25)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] splines   grid      stats     graphics  grDevices utils     datasets
    [8] methods   base

    other attached packages:
     [1] GO.db_2.4.1          org.Mm.eg.db_2.4.6   biomaRt_2.6.0
     [4] Heatplus_1.20.0      gplots_2.8.0         caTools_1.11
     [7] bitops_1.0-4.1       gdata_2.8.1          gtools_2.6.2
    [10] siggenes_1.24.0      multtest_2.7.1       Rgraphviz_1.29.0
    [13] xtable_1.5-6         annotate_1.28.1      GOstats_2.16.0
    [16] RSQLite_0.9-4        DBI_0.2-5            graph_1.28.0
    [19] Category_2.16.0      AnnotationDbi_1.12.0 Biobase_2.10.0

    loaded via a namespace (and not attached):
    [1] genefilter_1.32.0 GSEABase_1.12.1   MASS_7.3-11       RBGL_1.26.0
    [5] RCurl_1.5-0       survival_2.36-5   tcltk_2.12.2      tools_2.12.2
    [9] XML_3.2-0
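For readers unfamiliar with what hyperGTest reports, the over-representation p-value for a single GO term is an upper-tail hypergeometric probability. The sketch below is a minimal base R illustration of that calculation, not the GOstats implementation. The universe and selected-set sizes are the ones printed in the output above; the per-term counts (`term_in_universe`, `term_in_selected`) are made-up values used only for illustration.

```r
# Sizes reported in the hyperGTest output above
universe_size <- 2935   # genes in the universe
selected_size <- 1006   # genes in the selected set

# Hypothetical counts for one GO term (assumed, not from the thread)
term_in_universe <- 60  # universe genes annotated to the term
term_in_selected <- 30  # selected genes annotated to the term

# P(X >= term_in_selected) under the hypergeometric null:
# selected_size genes drawn without replacement from the universe
p_over <- phyper(term_in_selected - 1,
                 term_in_universe,
                 universe_size - term_in_universe,
                 selected_size,
                 lower.tail = FALSE)

# Expected number of term genes in the selection under the null
expected_count <- selected_size * term_in_universe / universe_size

print(p_over)
print(expected_count)
```

The `- 1` in the first argument is what turns the strict upper tail returned by `phyper(..., lower.tail = FALSE)` into "at least as many as observed".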
Annotation GO GOstats
Marc Carlson ★ 7.2k
@marc-carlson-2264
United States
Hi Assa,

The error you reported suggests that a GO ID mapped to an Entrez Gene ID in your org.Mm.eg.db package (which is where the GO2ALL mapping comes from) is not present in your GO.db package, so I think we should start by checking whether your GO.db package is up to date.

Looking at your sessionInfo(), I can see that you have a stale version of GO.db (2.4.1). You should be using GO.db version 2.4.5 if you want to use org.Mm.eg.db version 2.4.6. The annotation packages released for each version of Bioconductor are meant to be used as a matched set.

You can avoid having to worry about all of this by using biocLite() to install all of the packages that you plan to use. biocLite() should always install the appropriate version of a given Bioconductor package for whichever version of R you happen to be running. We explain how to install and update Bioconductor packages on our website: http://www.bioconductor.org/install/

Marc

On 04/07/2011 08:22 AM, Assa Yeroslaviz wrote:
> [...]
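One way to spot this kind of mismatch is to list the installed versions of the relevant packages side by side. The helper below, `installed_versions()`, is a hypothetical name introduced here; it relies only on base R (`packageVersion()` wrapped in `tryCatch()`), so it reports NA rather than failing when a package is absent. Note that in current Bioconductor releases `biocLite()` has been superseded by `BiocManager::install()`, and `BiocManager::valid()` reports packages that are out of step with the installed release.

```r
# Report the installed version of each package, or NA if it is not installed.
installed_versions <- function(pkgs) {
  versions <- vapply(pkgs, function(p) {
    tryCatch(as.character(packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
  data.frame(package = pkgs,
             version = unname(versions),
             stringsAsFactors = FALSE)
}

# The annotation packages discussed in this thread
print(installed_versions(c("GO.db", "org.Mm.eg.db", "GOstats")))
```

Packages from the same Bioconductor release should show matching annotation versions (for example 2.4.5 and 2.4.6 in the release discussed here).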
@robert-m-flight-4158
United States
Hi Assa,

As far as I am aware, if the GO term comes up in your list, then there should be genes annotated to it. I did a simple test to verify that the GO term exists:

    > crud <- as.list(GOTERM)
    > crud$'GO:2000021'
    GOID: GO:2000021
    Term: regulation of ion homeostasis
    Ontology: BP
    Definition: Any process that modulates the frequency, rate or extent of ion homeostasis.
    Synonym: regulation of electrolyte homeostasis
    Synonym: regulation of negative regulation of crystal biosynthesis
    Synonym: regulation of negative regulation of crystal formation

So far so good. Now let's see which genes are annotated to it:

    > library(org.Mm.eg.db)
    > mget('GO:2000021', org.Mm.egGO)
    Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
      value for "GO:2000021" not found
    > mget('GO:2000021', org.Mm.egGO2EG)
    Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
      value for "GO:2000021" not found
    > mget('GO:2000021', org.Mm.egGO2ALLEGS)
    $`GO:2000021`
         ISO      ISO      ISO      ISO      IGI      IGI      IMP      IGI      ISO      ISO      IMP      ISO      ISO      IDA
     "11517"  "11684"  "11998"  "12000"  "12018"  "12028"  "12028"  "12043"  "12061"  "12257"  "12291"  "12349"  "12372"  "12389"
         ISO      ISO      ISO      ISO      ISO      IMP      ISO      ISO      IDA      IMP      IMP      IGI      IGI      ISO
     "12424"  "12558"  "13167"  "13489"  "13617"  "13666"  "14062"  "14126"  "14225"  "14225"  "14226"  "14629"  "14630"  "14652"
         ISO      IDA      IDA      ISO      IDA      ISO       IC      ISO      IMP      IMP      IDA      IMP      ISO      ISO
     "15171"  "15978"  "16818"  "16867"  "16963"  "17096"  "17131"  "18429"  "18439"  "18764"  "19264"  "20190"  "21333"  "21336"
         ISO      ISO      IMP      ISO      ISO      TAS      IDA      ISO      ISO      ISO      ISO      ISO      ISO      ISO
     "21803"  "21808"  "21819"  "21838"  "22041"  "22784"  "23832"  "24111"  "26361"  "50849"  "54140"  "76055"  "76757" "108837"
         ISO      IMP      ISO      ISO      IMP      ISO
    "217369" "225908" "233081" "238276" "259277" "317757"

BTW, this was all using GO.db_2.4.5.

From this information, no genes are directly annotated to your GO term; there are only indirect annotations. I know this doesn't help your current situation, but it points towards the problem at least. I thought, however, that when the summary is prepared it uses the GO2ALLEGS mapping and not the direct one. Perhaps someone more knowledgeable can figure out where in the code the error is likely to be?

-Robert

Robert M. Flight, Ph.D.
University of Louisville Bioinformatics Laboratory
University of Louisville
Louisville, KY
PH 502-852-1809 (HSC)
PH 502-852-0467 (Belknap)
EM robert.flight at louisville.edu
EM rflight79 at gmail.com

Williams and Holland's Law: If enough data is collected, anything may be proven by statistical methods.

On Thu, Apr 7, 2011 at 11:22, Assa Yeroslaviz wrote:
> [...]
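The `.checkKeys` error appears because `mget()` stops at the first key it cannot find. Passing `ifnotfound = NA` returns NA for missing keys instead of raising an error. The sketch below uses plain R environments as stand-ins for the `org.Mm.egGO2EG` (direct) and `org.Mm.egGO2ALLEGS` (direct plus inherited) maps, so it runs without any annotation package; the child term `GO:CHILD` and the gene assignments are invented for illustration. The AnnotationDbi `mget()` method for these maps is assumed to accept `ifnotfound = NA` in the same way, which is worth checking against the installed version.

```r
# Toy stand-ins for the annotation maps (invented data)
direct_map <- new.env()   # plays the role of org.Mm.egGO2EG
all_map    <- new.env()   # plays the role of org.Mm.egGO2ALLEGS

# Genes are annotated directly only to the child term ...
assign("GO:CHILD", c("11517", "11684"), envir = direct_map)
# ... while the "all" map also lists them under the parent term
assign("GO:CHILD",   c("11517", "11684"), envir = all_map)
assign("GO:2000021", c("11517", "11684"), envir = all_map)

# Without ifnotfound, a missing key is an error; capture its message
direct_error <- tryCatch(
  mget("GO:2000021", envir = direct_map),
  error = function(e) conditionMessage(e)
)

# With ifnotfound = NA, a missing key yields NA instead
direct_hit <- mget("GO:2000021", envir = direct_map, ifnotfound = NA)
all_hit    <- mget("GO:2000021", envir = all_map,    ifnotfound = NA)

print(direct_error)
print(direct_hit)
print(all_hit)
```

An NA from the direct map combined with a non-empty result from the "all" map is the pattern seen for GO:2000021 above: the term has only inherited annotations.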
Well well, I am ashamed to say that it is now working. Apparently all I needed to do was update the packages: I installed the new versions of GO.db and GOstats, and it works now.

However, I am still getting this error when trying to find which genes are attached to the term:

    > mget('GO:2000021', org.Mm.egGO)
    Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
      value for "GO:2000021" not found
    > mget('GO:2000021', org.Mm.egGO2EG)
    Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
      value for "GO:2000021" not found

So I guess the earlier error message has nothing to do with the fact that no genes from the mouse genome are mapped to this GO category. When I checked in AmiGO to see whether there really are no mouse genes under this category, I found 83 genes.

Can anyone tell me, then, what this error means? Is there a way of manually updating the GO data set, so that I can map these genes?

Thanks,
Assa

    > sessionInfo()
    R version 2.12.2 (2011-02-25)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] splines   grid      stats     graphics  grDevices utils     datasets
    [8] methods   base

    other attached packages:
     [1] GSEABase_1.12.1      org.Mm.eg.db_2.4.6   biomaRt_2.6.0
     [4] Heatplus_1.20.0      ggplot2_0.8.9        proto_0.3-9.1
     [7] reshape_0.8.4        plyr_1.4             gplots_2.8.0
    [10] caTools_1.11         bitops_1.0-4.1       gdata_2.8.1
    [13] gtools_2.6.2         siggenes_1.24.0      multtest_2.7.1
    [16] Rgraphviz_1.29.0     xtable_1.5-6         annotate_1.28.1
    [19] GO.db_2.4.5          GOstats_2.16.0       RSQLite_0.9-4
    [22] DBI_0.2-5            graph_1.28.0         Category_2.16.0
    [25] AnnotationDbi_1.12.0 Biobase_2.10.0

    loaded via a namespace (and not attached):
    [1] genefilter_1.32.0 MASS_7.3-11       RBGL_1.26.0       RCurl_1.5-0
    [5] survival_2.36-5   tools_2.12.2      XML_3.2-0

On Thu, Apr 7, 2011 at 18:49, Robert M. Flight wrote:
> [...]
Hi Assa,

The reason you are getting no genes is that no genes are "directly" annotated to this term. I got the same error when I looked up your GO term of interest using GO or GO2EG. In this case you need to use org.Mm.egGO2ALLEGS to find the genes that are indirectly annotated to this term via other, more specific terms.

Also keep in mind that AmiGO is updated regularly, whereas the Bioconductor packages are updated every 6 months. This may lead to some discrepancies between the results from AmiGO and Bioconductor.

-Robert

On Fri, Apr 8, 2011 at 01:43, Assa Yeroslaviz wrote:
> [...]
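The difference between the direct maps and the "ALL" map can be illustrated with a small propagation step: a term's full gene set is the union of the genes annotated to it and to every term below it in the ontology. The sketch below is base R on an invented three-term hierarchy; `offspring`, `direct`, and `all_genes()` are hypothetical names and toy data, not real GO content. With real data, GO.db provides offspring maps such as `GOBPOFFSPRING`.

```r
# Invented toy hierarchy: each term lists all of its descendants
offspring <- list(
  "GO:PARENT" = c("GO:CHILD1", "GO:CHILD2"),
  "GO:CHILD1" = character(0),
  "GO:CHILD2" = character(0)
)

# Direct annotations only; the parent has none, as with GO:2000021
direct <- list(
  "GO:CHILD1" = c("11517", "12028"),
  "GO:CHILD2" = c("12028", "14225")
)

# Union of genes annotated to a term or to any of its descendants
all_genes <- function(term, direct, offspring) {
  terms <- c(term, offspring[[term]])
  genes <- as.character(unlist(direct[terms], use.names = FALSE))
  sort(unique(genes))
}

print(all_genes("GO:PARENT", direct, offspring))
print(all_genes("GO:CHILD1", direct, offspring))
```

Here the parent term has no entry in `direct` at all, yet `all_genes()` returns three genes for it, which mirrors a failed GO2EG lookup alongside a populated GO2ALLEGS result.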

I have run into this error, but I believe all of my packages are up to date. After following the suggestions throughout this previous post, I think I have a GO ID from org.Mm.eg.db that is not in GO.db. Any suggestions on how to extract the summary(hgOver) information would be appreciated.
 

> library("GOstats")

Attaching package: ‘GOstats’

The following object is masked from ‘package:AnnotationDbi’:

    makeGOGraph

> library("AnnotationDbi")
> library("org.Mm.eg.db")

> total.genes <- dput(as.character(mito.prot.sort.entrez.nodups.clus$ENTREZID))

> hgCutoff = 0.001
> params <- new("GOHyperGParams",
+               geneIds=test.genes,
+               universeGeneIds=total.genes,
+               ontology="BP",
+               annotation= "org.Mm.eg.db",
+               pvalueCutoff=hgCutoff,
+               conditional=FALSE,
+               testDirection="over")
> paramsMF <- params
> ontology(paramsMF) <- "MF"
> paramsCC <- params
> ontology(paramsCC) <- "CC"
> hgOver <- hyperGTest(paramsCC)
> df <- summary(hgOver,categorySize=10)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) : 
  value for "GO:0097708" not found
> x <- as.list(GOTERM)
> x$'GO:0097708'
NULL
> mget('GO:0097708',org.Mm.egGO)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) : 
  value for "GO:0097708" not found
> mget('GO:0097708',org.Mm.egGO2ALLEGS)
$`GO:0097708`
[removed long output]

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GO.db_3.4.0          org.Mm.eg.db_3.4.0   GOstats_2.40.0       UniProt.ws_2.14.0    RCurl_1.95-4.8      
 [6] bitops_1.0-6         RSQLite_1.1-2        Category_2.40.0      Matrix_1.2-8         GSEABase_1.36.0     
[11] graph_1.52.0         annotate_1.52.1      XML_3.98-1.5         AnnotationDbi_1.36.2 IRanges_2.8.1       
[16] S4Vectors_0.12.1     Biobase_2.34.0       BiocGenerics_0.20.0 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9            magrittr_1.5           splines_3.3.3          xtable_1.8-2           lattice_0.20-34       
 [6] R6_2.2.0               dplyr_0.5.0            tools_3.3.3            grid_3.3.3             AnnotationForge_1.16.1
[11] DBI_0.6                genefilter_1.56.0      assertthat_0.1         survival_2.40-1        RBGL_1.50.0           
[16] digest_0.6.12          tibble_1.2             memoise_1.0.0         
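A possible workaround that does not depend on every GO ID being present in GO.db is to assemble the result table by hand and leave the term name as NA where it cannot be looked up. The sketch below is base R on invented numbers; `build_summary()` and its inputs are hypothetical. With a real result, the named vectors could presumably be taken from the Category accessors (`pvalues(hgOver)`, `geneCounts(hgOver)`, `universeCounts(hgOver)`) and the term lookup from `Term(GOTERM)`, but those calls should be checked against the installed package versions.

```r
# Build a summary table; term names missing from the lookup become NA
build_summary <- function(pvals, counts, sizes, term_names,
                          cutoff = 0.001, min_size = 10) {
  ids  <- names(pvals)
  keep <- pvals < cutoff & sizes[ids] >= min_size
  ids  <- ids[keep]
  out <- data.frame(
    GOID   = ids,
    Pvalue = unname(pvals[ids]),
    Count  = unname(counts[ids]),
    Size   = unname(sizes[ids]),
    Term   = unname(term_names[ids]),  # NA when the ID is absent
    stringsAsFactors = FALSE
  )
  out[order(out$Pvalue), , drop = FALSE]
}

# Invented example values (not from the thread)
pvals  <- c("GO:0097708" = 1e-6, "GO:0005739" = 1e-4, "GO:0005634" = 0.2)
counts <- c("GO:0097708" = 25,   "GO:0005739" = 40,   "GO:0005634" = 12)
sizes  <- c("GO:0097708" = 120,  "GO:0005739" = 300,  "GO:0005634" = 500)
# Term lookup lacking GO:0097708, as in the session above
term_names <- c("GO:0005739" = "mitochondrion", "GO:0005634" = "nucleus")

res <- build_summary(pvals, counts, sizes, term_names)
print(res)
print(setdiff(names(pvals), names(term_names)))  # IDs with no term entry
```

The final `setdiff()` lists the GO IDs that the gene mapping knows about but the term table does not, which is the situation reported for GO:0097708.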
