topGO Fisher test error
Guest User ★ 13k
@guest-user-4897
Last seen 9.7 years ago
Hi list!

I hope someone can help me with this, as I've been stuck for a solid couple of days and have read other posts without finding my issue. I want to perform an enrichment analysis on a list of genes I found in a microarray experiment. The topGOdata object seems to be generated without errors, but then I can't perform the Fisher test on it. I've pasted everything from the very start (sorry for that), in case I did something wrong earlier.

    x <- hugene11sttranscriptclusterENTREZID
    probekeys <- Lkeys(x)   # gene universe (probeset IDs)
    x <- hugene11sttranscriptclusterGO
    mappedGO <- mappedkeys(x)
    probe2GO <- as.list(x[mappedGO])   # list of probe2GO
    geneList <- factor(as.integer(probekeys %in% intgenes))   # intgenes = my list of interesting probe IDs
    names(geneList) <- probekeys
    str(geneList)
     Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
     - attr(*, "names")= chr [1:33295] "7892501" "7892502" "7892503" "7892504" ...

    GOdata <- new("topGOdata", ontology = "BP", allGenes = geneList, annot = annFUN.gene2GO, gene2GO = probe2GO)
    GOdata

    ------------------------- topGOdata object -------------------------
     Description:
       -
     Ontology:
       -  BP
     33295 available genes (all genes from the array):
       - symbol:  7892501 7892502 7892503 7892504 7892505 ...
       - 898 significant genes.
     11077 feasible genes (genes that can be used in the analysis):
       - symbol:  7896740 7896754 7896779 7896822 7896921 ...
       - 530 significant genes.
     GO graph (nodes with at least 1 genes):
       - a graph with directed edges
       - number of nodes = 10376
       - number of edges = 22233
    ------------------------- topGOdata object -------------------------

    test.stat <- new("classicCount", testStatistic = GOFisherTest, name = "Fisher test")
    resultFisher <- getSigGroups(GOdata, test.stat)

                 -- Classic Algorithm --
             the algorithm is scoring 2569 nontrivial nodes
             parameters:
                 test statistic: Fisher test
    Error in fisher.test(contMat, alternative = "greater") :
      all entries of 'x' must be nonnegative and finite

This is the error I get, and I don't know what it means.

Any help would be much appreciated. Sorry if this has been too long! Many thanks!

Bruno

-- output of sessionInfo():

    > sessionInfo()
    R version 2.14.1 (2011-12-22)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] topGO_2.6.0                          SparseM_0.97
     [3] GO.db_2.6.1                          graph_1.32.0
     [5] hugene11sttranscriptcluster.db_4.0.1 org.Hs.eg.db_2.6.4
     [7] RSQLite_0.11.2                       DBI_0.2-5
     [9] AnnotationDbi_1.16.19                Biobase_2.14.0

    loaded via a namespace (and not attached):
    [1] grid_2.14.1    IRanges_1.12.6 lattice_0.20-0 tools_2.14.1

-- Sent via the guest posting facility at bioconductor.org.
Adrian Alexa ▴ 400
@adrian-alexa-936
Last seen 9.7 years ago
Hi Bruno,

there are a couple of issues with your code. First, the problem lies in the way 'probe2GO' is formatted. It should be a named list, where the names are the probe IDs and each list entry is a character vector of GO IDs (the ones to which that probe ID is annotated). However, your 'probe2GO' object has a different structure, and this results in a faulty topGOdata object. For example:

    > str(head(probe2GO, 2))
    List of 2
     $ 7896742:List of 5
      ..$ GO:0007049:List of 3
      .. ..$ GOID    : chr "GO:0007049"
      .. ..$ Evidence: chr "IEA"
      .. ..$ Ontology: chr "BP"
      ..$ GO:0051301:List of 3
      .. ..$ GOID    : chr "GO:0051301"
      .. ..$ Evidence: chr "IEA"
      .. ..$ Ontology: chr "BP"
      ..$ GO:0031105:List of 3
      .. ..$ GOID    : chr "GO:0031105"
      .. ..$ Evidence: chr "IEA"
      .. ..$ Ontology: chr "CC"
      ..$ GO:0005515:List of 3
      .. ..$ GOID    : chr "GO:0005515"
      .. ..$ Evidence: chr "IPI"
      .. ..$ Ontology: chr "MF"
      ..$ GO:0005525:List of 3
      .. ..$ GOID    : chr "GO:0005525"
      .. ..$ Evidence: chr "IEA"
      .. ..$ Ontology: chr "MF"
     $ 7896779:List of 13
      ..$ GO:0030036:List of 3
      .. ..$ GOID    : chr "GO:0030036"
      .. ..$ Evidence: chr "ISS"
      .. ..$ Ontology: chr "BP"
      ..............................................

What you need is something like this:

    > probe2GO <- lapply(probe2GO, names)
    > str(head(probe2GO, 2))
    List of 2
     $ 7896742: chr [1:5] "GO:0007049" "GO:0051301" "GO:0031105" "GO:0005515" ...
     $ 7896779: chr [1:13] "GO:0030036" "GO:0016567" "GO:0007420" "GO:0005886" ...

With this, you can create a topGOdata object and perform the statistical test.

    > intgenes <- sample(probekeys, 1000)
    > geneList <- factor(as.integer(probekeys %in% intgenes))   # intgenes = my list of interesting probe IDs
    > names(geneList) <- probekeys
    > str(geneList)
     Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...
     - attr(*, "names")= chr [1:33295] "7892501" "7892502" "7892503" "7892504" ...

    > GOdata <- new("topGOdata", ontology = "BP", allGenes = geneList, annot = annFUN.gene2GO, gene2GO = probe2GO)

    Building most specific GOs .....    ( 8788 GO terms found. )
    Build GO DAG topology ..........    ( 11951 GO terms and 27203 relations. )
    Annotating nodes ...............    ( 15500 genes annotated to the GO terms. )

    > test.stat <- new("classicCount", testStatistic = GOFisherTest, name = "Fisher test")
    > resultFisher <- getSigGroups(GOdata, test.stat)

                 -- Classic Algorithm --
             the algorithm is scoring 4097 nontrivial nodes
             parameters:
                 test statistic: Fisher test

    > resultFisher

    Description:
    Ontology: BP
    'classic' algorithm with the 'Fisher test' test
    11951 GO terms scored: 59 terms with p < 0.01
    Annotation data:
        Annotated genes: 15500
        Significant genes: 469
        Min. no. of genes annotated to a GO: 1
        Nontrivial nodes: 4097

So add the 'probe2GO <- lapply(probe2GO, names)' line before building the topGOdata object.

Second, all of the above can be done much more easily. You don't need to build the probe-to-GO mapping yourself: topGO provides a few annotation functions that will do it for you. Please read the help page for 'annFUN' and Section 4 of the package vignette. You can get the same results by doing something like:

    ## gene universe (probeset IDs)
    probekeys <- Lkeys(hugene11sttranscriptclusterENTREZID)

    intgenes <- sample(probekeys, 1000)
    geneList <- factor(as.integer(probekeys %in% intgenes))   # intgenes = my list of interesting probe IDs
    names(geneList) <- probekeys

    ## use annFUN.db for a Bioconductor annotation package
    GOdata <- new("topGOdata", ontology = "BP", allGenes = geneList, annot = annFUN.db, affyLib = "hugene11sttranscriptcluster")
    GOdata

    ## you can use runTest() instead of new("classicCount", ...) and getSigGroups(...)
    resultFisher <- runTest(GOdata, algorithm = "classic", statistic = "fisher")

Hope this helps.

Best regards,
Adrian
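As a small illustration of the reshaping step described above, here is a minimal base-R sketch that needs neither topGO nor the annotation package. The nested list is a hand-built toy copy of a few records from the str() output in this thread. The names make_record, probe2GO_nested, probe2GO_flat and is_gene2GO are hypothetical helpers introduced only for this example and are not part of topGO.

```r
# Toy stand-in for as.list(hugene11sttranscriptclusterGO[mappedGO]):
# each probe ID maps to a list of records, keyed by GO ID.
# (Hypothetical helper, for illustration only.)
make_record <- function(id, evidence, ontology) {
  list(GOID = id, Evidence = evidence, Ontology = ontology)
}

probe2GO_nested <- list(
  "7896742" = list(
    "GO:0007049" = make_record("GO:0007049", "IEA", "BP"),
    "GO:0051301" = make_record("GO:0051301", "IEA", "BP"),
    "GO:0031105" = make_record("GO:0031105", "IEA", "CC")
  ),
  "7896779" = list(
    "GO:0030036" = make_record("GO:0030036", "ISS", "BP")
  )
)

# Flatten: keep only the GO IDs, i.e. the names of each inner list.
probe2GO_flat <- lapply(probe2GO_nested, names)
str(probe2GO_flat)

# Assumed sanity check for the shape described in the answer:
# a named list whose entries are character vectors of GO IDs.
is_gene2GO <- function(x) {
  is.list(x) && !is.null(names(x)) &&
    all(vapply(x, is.character, logical(1)))
}

is_gene2GO(probe2GO_nested)  # FALSE: entries are lists of records
is_gene2GO(probe2GO_flat)    # TRUE: entries are character vectors
```

Running a check like is_gene2GO() on the real probe2GO before calling new("topGOdata", ...) is one way to catch the nested format early, since the thread shows that the object is otherwise built without any error or warning.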