hyperGTest on KEGG and PFAM with org.XX.eg annotations
James F. Reid
Dear list,

hyperGTest behaves differently with org.XX.eg.db packages than with microarray-based ones, such as hgu95av2.db, when doing a KEGG analysis. hyperGTest complains if the annotation string does not end with the suffix ".db". It works if you add the suffix, but then you can't run summary on the result. A quick fix is to re-assign the ".db"-less string to the annotation slot of the hyperGTest result. So I am wondering whether I am doing something wrong or whether it is a bug.

For the PFAM analysis everything works fine, except that in the summary output the Term (Description) is just the PFAMID, which is not very useful for interpretation. I think this could easily be fixed by using the same approach as for the KEGG output in the PFAMHyperGResult summary method:

    ## implicit require("PFAM.db")
    pfamEnv <- getAnnMap("DE", "PFAM", load=TRUE)
    pfamTerms <- unlist(mget(pfamIds, pfamEnv, ifnotfound=NA))

Many thanks,
James.

Here is a session reporting the problem:

    library("Category")
    library("org.Hs.eg.db")

    set.seed(123)
    geneBackground <- Lkeys(org.Hs.egPATH)
    geneList <- sample(geneBackground, 500)

    params <- new("KEGGHyperGParams",
                  geneIds = geneList,
                  universeGeneIds = geneBackground,
                  annotation = "org.Hs.eg")

    hgKEGG <- hyperGTest(params)
    # Error in get(paste(lib, name, sep = "")) :
    #   variable "org.Hs.egPATH2PROBE" was not found

    params@annotation <- "org.Hs.eg.db"
    hgKEGG <- hyperGTest(params)
    summary(hgKEGG)
    # Error in get(paste(annotation(object), "ORGANISM", sep = "")) :
    #   variable "org.Hs.eg.dbORGANISM" was not found

    hgKEGG@annotation <- "org.Hs.eg"
    summary(hgKEGG)
    #  KEGGID      Pvalue OddsRatio  ExpCount Count Size
    #1  05130 0.003282103  7.314332 0.6239536     4   51
    #2  05131 0.003282103  7.314332 0.6239536     4   51
    #                                          Term
    #1 Pathogenic Escherichia coli infection - EHEC
    #2 Pathogenic Escherichia coli infection - EPEC

    > sessionInfo()
    R version 2.8.0 (2008-10-20)
    i486-pc-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] splines   tools     stats     graphics  grDevices utils     datasets
    [8] methods   base

    other attached packages:
     [1] KEGG.db_2.2.5       org.Hs.eg.db_2.2.6  RSQLite_0.7-1
     [4] DBI_0.2-4           Category_2.8.1      genefilter_1.22.0
     [7] survival_2.34-1     annotate_1.20.1     xtable_1.5-4
    [10] AnnotationDbi_1.4.1 graph_1.20.0        Biobase_2.2.1

    loaded via a namespace (and not attached):
    [1] cluster_1.11.11 GSEABase_1.4.0  RBGL_1.18.0     XML_1.98-1
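
For convenience, the quick fix described in this post could be wrapped in a small helper. This is only a sketch, assuming the behaviour shown in the session (Category 2.8.1): hyperGTest needs the annotation string with the ".db" suffix, while summary needs it without. The function names stripDbSuffix, addDbSuffix and hyperGTestOrgEg are hypothetical and not part of Category. The wrapper sets the annotation slot directly with "@", as in the session, and expects library("Category") to be loaded before it is called.

```r
# Hypothetical helpers; none of these functions is part of Category.

# Return the annotation string without a trailing ".db".
stripDbSuffix <- function(annotation) {
  sub("\\.db$", "", annotation)
}

# Return the annotation string with exactly one trailing ".db".
addDbSuffix <- function(annotation) {
  paste(stripDbSuffix(annotation), ".db", sep = "")
}

# Run hyperGTest with the ".db" form of the annotation string, then put the
# ".db"-less form back on the result so that summary() can find
# <annotation>ORGANISM. Assumes both objects have an "annotation" slot,
# as in the session reported in this post.
hyperGTestOrgEg <- function(params) {
  params@annotation <- addDbSuffix(params@annotation)
  result <- hyperGTest(params)
  result@annotation <- stripDbSuffix(result@annotation)
  result
}

# Usage, with params built as in the session reported in this post:
# hgKEGG <- hyperGTestOrgEg(params)
# summary(hgKEGG)
```

The two string helpers are idempotent, so the wrapper should behave the same whether params was created with "org.Hs.eg" or "org.Hs.eg.db". If the underlying inconsistency is a bug that gets fixed in Category, the wrapper would no longer be needed.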

