new topGO results using GO.db very different from old ones using GO
@joern-toedling-1244
Dear all,

I would appreciate any suggestions on the following issue. I have noticed a major inconsistency between new and older topGO results. For the older ones, topGO used the "GO" package, while it uses "GO.db" for the new results. I can't figure out whether it is a problem with topGO only or whether there are some serious inconsistencies between GO and GO.db.

Here is the source code I used:

library("topGO")
## load list of genes of interest
load("brainOnlyGenes.RData")
## load general gene-to-GO mapping and universe of genes to use in analysis:
load("mm9gene2GO.RData")
load("arrayGenesWithGO.RData")

## then the function to call topGO and to return a nice result table:
sigGOTable <- function(selGenes, GOgenes=arrayGenesWithGO,
                       gene2GO=mm9.gene2GO[arrayGenesWithGO],
                       ontology="BP", maxP=0.001)
{
  inGenes <- factor(as.integer(GOgenes %in% selGenes))
  names(inGenes) <- GOgenes
  GOdata <- new("topGOdata", ontology=ontology, allGenes=inGenes,
                annot=annFUN.gene2GO, gene2GO=gene2GO)
  myTestStat <- new("elimCount", testStatistic=GOFisherTest,
                    name="Fisher test", cutOff=maxP)
  mySigGroups <- getSigGroups(GOdata, myTestStat)
  sTab <- GenTable(GOdata, mySigGroups, topNodes=length(usedGO(GOdata)))
  names(sTab)[length(sTab)] <- "p.value"
  return(subset(sTab, as.numeric(p.value) < maxP))
}

## call it:
(brainRes <- sigGOTable(brainOnlyGenes))

# with topGO_1.4.0 using GO_2.0.1
# this is:
#        GO.ID                  Term Annotated Significant Expected p.value
# 1 GO:0007268 synaptic transmission       136          44    24.46 3.0e-05
# 2 GO:0007610              behavior       180          54    32.38 4.4e-05
# 3 GO:0007409          axonogenesis       119          38    21.41 0.00014
# 4 GO:0006887            exocytosis        40          17     7.20 0.00026
# 5 GO:0007420     brain development       136          40    24.46 0.00066

# which kind of makes sense as an annotation of a list of interesting genes when investigating brain cells

## now unfortunately, using the same gene list, universe and gene-to-GO mapping, and the same function as above
## with topGO_1.9.0 using GO.db_2.2.0, the result is:
#        GO.ID                                Term Annotated Significant Expected p.value
# 1 GO:0007268    mitochondrial genome maintenance       137          44    24.65 3.7e-05
# 2 GO:0007610                        reproduction       180          54    32.39 4.4e-05
# 3 GO:0007409          single strand break repair       119          38    21.41 0.00014
# 4 GO:0006887     regulation of DNA recombination        40          17     7.20 0.00026
# 5 GO:0007420 regulation of mitotic recombination       136          40    24.47 0.00066

# which is obviously very, very different

Does anyone have an educated guess as to what is going on? Could it be a bug in topGO? Or is the information in GO.db really different from the one in GO, and in that case, which one is the right one?

Best regards,
Joern
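A quick way to narrow the problem down is to line the two result tables up by GO ID and check which columns actually differ. The base R sketch below re-enters the five rows quoted above by hand; the object names `old_res`, `new_res` and `cmp` are illustrative and not part of topGO.

```r
# Rows copied by hand from the two result tables quoted above.
go_ids <- c("GO:0007268", "GO:0007610", "GO:0007409", "GO:0006887", "GO:0007420")

old_res <- data.frame(
  GO.ID       = go_ids,
  Term        = c("synaptic transmission", "behavior", "axonogenesis",
                  "exocytosis", "brain development"),
  Annotated   = c(136, 180, 119, 40, 136),
  Significant = c(44, 54, 38, 17, 40),
  stringsAsFactors = FALSE
)

new_res <- data.frame(
  GO.ID       = go_ids,
  Term        = c("mitochondrial genome maintenance", "reproduction",
                  "single strand break repair",
                  "regulation of DNA recombination",
                  "regulation of mitotic recombination"),
  Annotated   = c(137, 180, 119, 40, 136),
  Significant = c(44, 54, 38, 17, 40),
  stringsAsFactors = FALSE
)

# Join on the GO ID so rows are matched by identifier, not by position.
cmp <- merge(old_res, new_res, by = "GO.ID", suffixes = c(".old", ".new"))

# Flag which columns agree between the two runs.
cmp$same_term        <- cmp$Term.old == cmp$Term.new
cmp$same_significant <- cmp$Significant.old == cmp$Significant.new

print(cmp[, c("GO.ID", "Term.old", "Term.new", "same_term", "same_significant")])
```

For the rows quoted in this post, the GO IDs and the Significant counts agree in every row while every term name differs, which suggests looking at how the names are attached to the IDs rather than at the test statistics.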
GO annotate topGO BRAIN
Adrian Alexa
@adrian-alexa-936
Hi Joern,

this seems to be a bug in topGO. The GO IDs are the same in both cases, but the term names are wrong in the second table: GO:0007610 is not "reproduction" but "behavior". At first sight it looks like a bug in the GenTable function, but I need to look at it more closely. There will be an update soon.

Thanks for the report, and I apologize for the bug,
Adrian

On Fri, May 2, 2008 at 2:48 PM, Joern Toedling <toedling at ebi.ac.uk> wrote:
> Dear all,
> I would appreciate any suggestion on the following issue. [...]
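The term names can also be checked independently of topGO by asking GO.db directly for the name attached to each ID. The following is a minimal sketch that assumes a current GO.db installation with the `select()` interface from AnnotationDbi; the GO.db_2.2.0 release mentioned in the thread may not have offered this interface, and the variable names `ids` and `term_tab` are illustrative.

```r
# Assumes the Bioconductor annotation package GO.db is installed.
library(GO.db)

# The five GO IDs reported in both result tables.
ids <- c("GO:0007268", "GO:0007610", "GO:0007409", "GO:0006887", "GO:0007420")

# Look up term name and ontology keyed by GO ID, bypassing GenTable entirely.
term_tab <- AnnotationDbi::select(GO.db, keys = ids,
                                  columns = c("TERM", "ONTOLOGY"),
                                  keytype = "GOID")
print(term_tab)
```

If the names returned here match the older table rather than the newer one, the annotation data is fine and the mislabelling happens inside topGO. Note that some GO terms have been renamed since 2008, so the wording from a current GO.db need not match the old table character for character.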
Tony Chiang
@tony-chiang-1769
Hi Joern,

One thing to do first is to check out topGO (both the current version and the older version) and run svn diff on the functions from the package that you are using. If there are no (significant) changes within those functions, then I would presume that GO and GO.db are the culprits. I am sure that the maintainers of both topGO and the GO annotation package would greatly appreciate it if someone found a bug like this and reported it!

Tony

On Fri, May 2, 2008 at 3:48 PM, Joern Toedling <toedling@ebi.ac.uk> wrote:
> Dear all,
> I would appreciate any suggestion on the following issue. [...]
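If a source checkout is not at hand, a rough substitute for svn diff is to compare the deparsed definitions of a function from within R. The sketch below uses two hypothetical stand-in functions, `f_old` and `f_new`, defined inline; loading two versions of the same package into one session is not straightforward, so obtaining the real definitions (for example from two separate library locations) is left as an assumption.

```r
# Hypothetical stand-ins for one function as defined in two package versions.
f_old <- function(x) sort(x)
f_new <- function(x) sort(x, decreasing = TRUE)

# Turn each definition into a character vector of source lines.
src_old <- deparse(f_old)
src_new <- deparse(f_new)

# TRUE only if the two definitions deparse to exactly the same text.
unchanged <- identical(src_old, src_new)
print(unchanged)

# Lines present in one version but not in the other.
print(setdiff(src_old, src_new))
print(setdiff(src_new, src_old))
```

This only shows that something changed and roughly where; a proper diff of the package sources, as suggested above, remains the more reliable check.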