incorrect number of dimensions in probeSetSummary function from GOstats
ramonmassoni ▴ 10
@ramonmassoni-19642

Hi all,

I am running a gene ontology enrichment analysis with the GOstats package as follows:

```{r}
params <- new("GOHyperGParams",
              geneIds = targetentrez,
              universeGeneIds = universeentrez,
              annotation = "org.Hs.eg.db",
              ontology = "BP",
              pvalueCutoff = 1,
              conditional = TRUE,
              testDirection = "over")
hgOver_gt <- hyperGTest(params)
goresults <- summary(hgOver_gt)
```

Which yields a nice data frame with the enriched GO terms. Now, I want to know which genes in my target list are found in each enriched term:

```{r}
probeSetSummary(hgOver_gt)
```

But I get the following error:

Error in `[.default`(tab, , 1) : incorrect number of dimensions

I'm not sure whether this is a bug or whether I am doing something wrong. I would be most grateful if you could shed some light on the matter.

Best

Ramon

@james-w-macdonald-5106

The probeSetSummary function is intended (as its name sort of implies) to tell you what microarray probes contributed to a particular GO term being significant. You have NCBI Gene IDs, which are easy enough to directly map. Something like this:

## load the packages used below
> library(GOstats)
> library(org.Hs.eg.db)
## set seed for consistency
> set.seed(0xabeef)
> univ <- keys(org.Hs.eg.db)
> samp <- univ[sample(1:length(univ), 150)]
## no reason to use a p-value cutoff of 1!
> p <- new("GOHyperGParams", geneIds = samp, universeGeneIds = univ, annotation = "org.Hs.eg.db", ontology = "BP", pvalueCutoff = 0.05, conditional = TRUE, testDirection = "over")
> hyp <- hyperGTest(p)
> z <- summary(hyp)
> z <- z[z$Size >= 10,] ## Don't trust overly small GO terms (e.g., too few genes in a term)!
> head(z)
       GOBPID      Pvalue OddsRatio   ExpCount Count Size
1  GO:0090398 0.001745174  13.73185 0.23813605     3   78
2  GO:0000002 0.002034070  33.80545 0.06716658     2   22
3  GO:0014850 0.002420820  30.72893 0.07327263     2   24
10 GO:0006829 0.003060633  27.03709 0.08243171     2   27
11 GO:0006882 0.005397670  19.87059 0.10990894     2   36
31 GO:0060999 0.007979571  16.07879 0.13433315     2   44
                                                 Term
1                                 cellular senescence
2                    mitochondrial genome maintenance
3                         response to muscle activity
10                                 zinc ion transport
11                      cellular zinc ion homeostasis
31 positive regulation of dendritic spine development
> dim(z)
[1] 47  7
## map GO IDs to NCBI Gene IDs. Use GOALL, not GO, because we want indirect mappings as well.
> zlst <- mapIds(org.Hs.eg.db, z$GOBPID, "ENTREZID", "GOALL", multiVals = "list")
## convert to a list of data.frames with Gene ID in first column and boolean for significance
> zdf <- lapply(zlst, function(x) data.frame(ENTREZID = x, SIG = x %in% samp))
## or you could just subset to the genes from your own list
## (separate object names, so that neither step overwrites the input of the other)
> zsig <- lapply(zlst, function(x) x[x %in% samp])
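
To make the subsetting step easier to see in isolation, here is a minimal base R sketch on toy data. The GO IDs and gene IDs below are made up for illustration and are not real annotation; `go2genes` is a hypothetical stand-in for the list returned by `mapIds(..., multiVals = "list")`, and `samp` plays the same role as above. The `unique()` call and the removal of empty terms are extras I am assuming you might want, not something the steps above require.

```r
## Toy stand-in for the GO term -> Entrez Gene ID list (hypothetical IDs)
go2genes <- list(
  "GO:0000001" = c("101", "102", "102", "103"),
  "GO:0000002" = c("201", "202"),
  "GO:0000003" = c("103", "301", "302")
)

## Toy stand-in for the target gene list
samp <- c("102", "103", "999")

## For each term, keep the unique annotated genes that are in the target list
hits <- lapply(go2genes, function(x) unique(x[x %in% samp]))

## Drop terms that contain none of the target genes
hits <- hits[lengths(hits) > 0]

## Flatten to a long data.frame, one row per (term, gene) pair
hits_df <- data.frame(
  GOID = rep(names(hits), lengths(hits)),
  ENTREZID = unlist(hits, use.names = FALSE),
  stringsAsFactors = FALSE
)

print(hits_df)
```

The long format is handy if you want to merge the gene lists back onto the summary table by GO ID.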