GO annotation/analysis for ath1121501
Ann Hess ▴ 340
I am working with data from ath1121501 (Arabidopsis) arrays and I would like to do the following:

1. Subset a list of genes based on GO terms. For example, how many (and which) genes in a given list belong to MF=metabolism.
2. Create a pie chart of the distribution of GO terms for my list.
3. Find statistically over-represented GO terms.
4. Find pathway information for my list.

As simple as goal #1 appears to be, I am not sure how to subset a list by GO term. I am not even sure what GO annotation is available for ath1121501 in Bioconductor.

In an attempt to accomplish goal #2, I tried using the function ontoCompare from the goTools package, but got an error:

    > length(DEGList)
    [1] 1881
    > res <- ontoCompare(DEGList, probeType="ath1121501", plot=TRUE)
    [1] "Starting ontoCompare..."
    Error in as.vector(x, mode) : invalid argument 'mode'

In an attempt to accomplish goal #3, I tried using the GOHyperG function from the GOstats package, but the LocusLink ID information does not appear to be available for ath1121501 (this has been addressed in previous postings). Are there alternatives that can be used for ath1121501?

I was hoping to use the biomaRt package to get pathway information, but it doesn't look like it contains annotation for Arabidopsis.

Any suggestions would be sincerely appreciated!

Ann
Tags: Annotation, GO, ath1121501, goTools, GOstats, biomaRt
Nianhua Li ▴ 870
Ann Hess <hess at ...> writes:

> I am working with data from ath1121501 (arabidopsis) arrays and I would
> like to do the following:
>
> 1. Subset a list of genes based on GO terms. For example, how many
> (and which) genes in a given list belong to MF=metabolism.

    ## find the GO identifier for "metabolism"
    library(GO)
    myGoTerm <- "metabolism"
    myGoID <- unlist(eapply(GOTERM, function(g) if (g@Term == myGoTerm) TRUE else FALSE))
    myGoID <- names(myGoID[myGoID])
    print(myGoID)

    ## or, if you want to find the GO terms containing "metabolism"
    ##x <- eapply(GOTERM, function(g) if (length(grep("metabolism", g@Term))>0) cat(g@GOID, " ", g@Term, "\n"))

    ## get probeset IDs associated with myGoID
    library(ath1121501)
    myProbeID <- get(myGoID, ath1121501GO2PROBE)
    myAllProbeID <- get(myGoID, ath1121501GO2ALLPROBES)
    ?ath1121501GO2PROBE
    ?ath1121501GO2ALLPROBES

> 2. Create a pie chart of the distribution of GO terms for my list.
> 3. Find statistically over-represented GO terms.

    library(Category)
    library(ath1121501)
    library(GO)
    set.seed(123)
    ## draw 100 random probes as an example gene list
    probes <- ls(ath1121501ACCNUM)
    probes <- sample(probes, 100)
    locusList <- unique(unlist(mget(probes, ath1121501ACCNUM)))
    ## the test looks for a LOCUSID environment, so point it at ACCNUM
    ath1121501LOCUSID <- ath1121501ACCNUM
    ans <- geneGoHyperGeoTest(locusList, "ath1121501", "BP")
    ?geneGoHyperGeoTest
    class?GeneGoHyperGeoTestResult

> 4. Find pathway information for my list.

probe-to-AraCyc mapping in ath1121501PATH
probe-to-gene mapping in ath1121501ACCNUM

If you want pathway information from KEGG, use AnnBuilder 1.11.8 to build your own ath1121501, and check the environments ath1121501PATH and ath1121501ARACYC.

> In an attempt to accomplish goal #3, I tried using the GOHyperG function
> from the GOstats package, but the locus link ID information does not
> appear to be available for ath1121501 (this has been addressed in previous
> postings). Are there alternatives that can be used for ath1121501?

For the Arabidopsis annotation packages, the AGI locus identifier is used to retrieve annotations for a gene, i.e. Entrez Gene IDs and GenBank accessions are not used. Therefore, there is no xxxxLOCUSID environment; xxxxACCNUM gives the probe-to-AGI-locus mapping.

Hope it helps,
Nianhua
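For readers who want to see what the subsetting in goal #1 and the over-representation test in goal #3 boil down to, here is a minimal base-R sketch on toy data. It does not use the ath1121501 or GO packages. The probe names and the objects universe, go_probes and my_list are made up for illustration. It assumes you already have the set of probes annotated to one GO term (for example from a GO-to-probe environment like the one above) and shows only the set intersection and a one-sided hypergeometric p-value, not the full procedure a dedicated package would run.

```r
## Toy data: every name below is hypothetical, not taken from ath1121501.
universe  <- paste0("probe", 1:200)                 # all probes on the array
go_probes <- universe[1:40]                         # probes annotated to one GO term
my_list   <- c(universe[1:15], universe[101:125])   # a list of selected probes

## Goal 1: which probes in the list carry the GO term, and how many.
hits   <- intersect(my_list, go_probes)
n_hits <- length(hits)

## Goal 3: one-sided hypergeometric p-value, P(X >= n_hits), where X is the
## number of GO-annotated probes in a random draw of length(my_list) probes.
p_value <- phyper(n_hits - 1,
                  m = length(go_probes),                     # annotated probes
                  n = length(universe) - length(go_probes),  # unannotated probes
                  k = length(my_list),                       # size of the list
                  lower.tail = FALSE)

print(n_hits)
print(p_value)
```

With real data the same calculation would be repeated for every GO term of interest, and the resulting p-values would then need a multiple-testing adjustment (for example with p.adjust).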