Dear Fuyou,
As Dr. Gilbert Feng suggested, you can create a customized GeneAnswers object as explained in the package vignette (page 5, R code chunk #3). Since you are interested in GO terms, the following approach might be what you want. The second parameter of geneAnswersBuilder, annotationLib, takes either the name of a given annotation library file or a user-provided annotation list. So you need to provide an annotation list for GO terms in which the name of each list element is a GO term and each element contains a vector of TAIR IDs. Instead of converting to Entrez gene IDs, you can use org.At.tairGO2ALLTAIRS to generate that list and supply it to the builder function. You also need to specify totalGeneNumber, which is the number of unique TAIR IDs associated with those GO terms. After that you will have to manually set the AnnLib and CategoryType properties of the object to org.At.tair.db and GO, respectively (please take a look at the package manual for details). Once this is done, you are ready to use the other functions for downstream analysis.
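A minimal sketch of the counting step, using a small made-up GO-to-TAIR list so that it runs without any Bioconductor package installed. The GO terms, TAIR IDs, and the names goToTair and totalGenes are placeholders. With the real annotation package the list would presumably come from as.list(org.At.tairGO2ALLTAIRS). The builder call appears only as a comment because its arguments here are assumptions to be checked against the manual.

```r
# Toy stand-in for as.list(org.At.tairGO2ALLTAIRS):
# names are GO terms, each element is a vector of TAIR IDs (all made up here).
goToTair <- list(
  "GO:0000001" = c("AT1G01010", "AT1G01020"),
  "GO:0000002" = c("AT1G01020", "AT1G01030", "AT1G01040"),
  "GO:0000003" = c("AT1G01010")
)

# Drop duplicated IDs within each term so every gene appears once per term.
goToTair <- lapply(goToTair, unique)

# totalGeneNumber: the number of unique TAIR IDs across all GO terms.
totalGenes <- length(unique(unlist(goToTair, use.names = FALSE)))
print(totalGenes)

# With GeneAnswers loaded, the list and the count would then be supplied
# roughly like this (not run here; argument values are assumptions):
# q <- geneAnswersBuilder(sdgGeneInput, goToTair, testType = "hyperG",
#                         totalGeneNumber = totalGenes,
#                         geneExpressionProfile = sdgExpr, pvalueT = 0.05)
```

A gene annotated to several GO terms is counted only once here, which is why the list is flattened before unique() is applied.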
HTH.
Lei
————————
Lei Huang, Ph.D.
Bioinformatician
Center for Research Informatics
University of Chicago
Email: lhuang7 at uchicago.edu
On Jun 9, 2015, at 5:56 PM, Fu, Fuyou <fu115@purdue.edu> wrote:
Dear Lei,
Thanks for your help. I was able to build my instance following your suggestion, but I ran into a problem at the next step, as shown below.
> sdgGeneInput<-read.csv("SDG00.csv", header=T)
> sdgExpr<-read.csv("SDG00Ex.csv", header=T)
> x <- org.At.tairENTREZID
> # Get the ORF IDs that are mapped to an Entrez Gene ID
> mapped_genes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_genes])
> q<-geneAnswersBuilder(sdgGeneInput, xx, categoryType ='GO.BP', testType = c("hyperG",), totalGeneNumber=NULL, geneExpressionProfile =sdgExpr, pvalueT = 0.05, verbose=FALSE)
Error in c("hyperG", ) : argument 2 is empty
> q<-geneAnswersBuilder(sdgGeneInput, xx, categoryType ='GO.BP', testType = c("hyperG"), totalGeneNumber=NULL, geneExpressionProfile =sdgExpr, pvalueT = 0.05, verbose=FALSE)
[1] "categoryType is set User defined"
[1] "GeneAnswers instance has been successfully created!"
> class(q)
[1] "GeneAnswers"
attr(,"package")
[1] "GeneAnswers"
> getAnnLib(q)
NULL
> getCategoryType(q)
[1] "User defined"
> xx <- geneAnswersReadable(q)
[1] "Mapping geneInput ..."
Error in switch(sub("org.*[:.:]", "", libname), eg = "EG", tair = "TAIR", :
EXPR must be a length 1 vector
> a <- geneAnswersReadable(q)
[1] "Mapping geneInput ..."
Error in switch(sub("org.*[:.:]", "", libname), eg = "EG", tair = "TAIR", :
EXPR must be a length 1 vector
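The "EXPR must be a length 1 vector" error above is consistent with getAnnLib(q) returning NULL, which leaves switch() with nothing to match on. A minimal base R sketch of that failure follows. It assumes the internal libname is taken from the object's annotation library setting. That is an assumption, since the package source is not shown in this thread.

```r
# Assumption: 'libname' is derived from the object's annotation library,
# which is NULL for a "User defined" instance.
libname <- NULL

# sub() applied to NULL returns a zero-length character vector ...
key <- sub("org.*[:.:]", "", libname)
print(length(key))

# ... and switch() rejects anything that is not exactly one value.
msg <- tryCatch(
  switch(key, eg = "EG", tair = "TAIR"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this reading is right, giving the object a non-NULL annotation library, as described in Lei's reply at the top of the thread, should hand switch() a single string to work with.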
I really appreciate your help.
Fuyou
PU
From: Huang, Lei [BSD] - CRI [lhuang7@uchicago.edu]
Sent: Tuesday, June 09, 2015 5:46 PM
To: Fu, Fuyou
Cc: Gilbert Feng
Subject: Re: GenesAnswers for Arabidopsis!
Dear Fuyou,
You can follow Gilbert’s suggestion to build a customized GeneAnswers instance. Or you can explicitly convert TAIR ID (if this is the ID you have) to Entrez gene ID by
library(org.At.tair.db)
x <- org.At.tairENTREZID
# Get the ORF IDs that are mapped to an Entrez Gene ID
mapped_genes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_genes])
Then you have a list of Entrez gene IDs with TAIR IDs as the names of the list elements.
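For illustration, a list of this shape can be used to translate a vector of TAIR IDs, as sketched below. The TAIR IDs and Entrez values are invented, and tairToEntrez and myTair are hypothetical names. The real list would be the xx built above.

```r
# Toy stand-in for xx: names are TAIR IDs, values are Entrez gene IDs (made up).
tairToEntrez <- list(
  AT1G01010 = "100001",
  AT1G01020 = "100002",
  AT1G01030 = "100003"
)

# Hypothetical input IDs; the last one has no mapping.
myTair <- c("AT1G01020", "AT1G01010", "AT9G99999")

# Keep only IDs present in the mapping, then flatten to a named character vector.
mappedTair <- intersect(myTair, names(tairToEntrez))
entrez <- unlist(tairToEntrez[mappedTair])
print(entrez)

# IDs that could not be converted.
unmapped <- setdiff(myTair, names(tairToEntrez))
print(unmapped)
```

Filtering with intersect() first avoids NULL entries for IDs that are missing from the mapping.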
HTH.
Lei
On Jun 9, 2015, at 4:09 PM, Gilbert Feng <gilbertfeng@gmail.com> wrote:
Hi, Fuyou
categoryType should not be NULL if you specify an org.xx.xxx.db library. I can't remember whether GO.db or KEGG.db supports TAIR IDs, though I once used GeneAnswers to analyze Arabidopsis based on GO.db. Even in the worst case, where GO.db and KEGG.db don't support TAIR IDs directly, you can still build a GeneAnswers object the customized way. The vignette has a section named "build customized GeneAnswers". You can find the code in "code chunk number 3" at http://www.bioconductor.org/packages/release/bioc/vignettes/GeneAnswers/inst/doc/geneAnswers.R
The idea is that you retrieve all of the GO terms for your TAIR gene IDs, then retrieve all of the TAIR gene IDs for each GO ID. This GO-to-TAIR-ID list plays the role of "entrezIDList" in the vignette. Then you can run the rest of the analysis with GeneAnswers. You may have to manually reset "annLib" and "categoryType" before you can draw the network or heatmap with TAIR IDs.
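The two-step idea above amounts to inverting a gene-to-GO mapping. A small base R sketch with made-up IDs follows. The names tairToGo and goToTair are illustrative, and the real gene-to-GO associations would come from the annotation package.

```r
# Toy gene -> GO mapping (all IDs are placeholders).
tairToGo <- list(
  AT1G01010 = c("GO:0000001", "GO:0000003"),
  AT1G01020 = c("GO:0000001", "GO:0000002"),
  AT1G01030 = c("GO:0000002")
)

# Flatten to parallel vectors: one (gene, GO term) pair per position.
genes <- rep(names(tairToGo), lengths(tairToGo))
terms <- unlist(tairToGo, use.names = FALSE)

# Group genes by GO term to obtain the GO -> gene list.
goToTair <- split(genes, terms)
print(goToTair)
```

The result has one element per GO term, each holding the TAIR IDs annotated to it, which is the shape a user-provided annotation list needs.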
Hopefully, this is helpful!
Gilbert
> require(org.At.tair.db)
Loading required package: org.At.tair.db
> org.At.tairMAPCOUNTS
      org.At.tairARACYC org.At.tairARACYCENZYME          org.At.tairCHR
                    516                    3113                   27416
  org.At.tairCHRLENGTHS       org.At.tairCHRLOC    org.At.tairCHRLOCEND
                      7                   27205                   27205
      org.At.tairENZYME  org.At.tairENZYME2TAIR     org.At.tairGENENAME
                   2121                     723                    7804
          org.At.tairGO  org.At.tairGO2ALLTAIRS      org.At.tairGO2TAIR
                  27103                    7767                    5290
        org.At.tairPATH    org.At.tairPATH2TAIR         org.At.tairPMID
                   3450                     123                   22530
   org.At.tairPMID2TAIR       org.At.tairREFSEQ  org.At.tairREFSEQ2TAIR
                  22060                   27205                   70352
      org.At.tairSYMBOL
                  10667
On Tue, Jun 9, 2015 at 3:39 PM, Fu, Fuyou <fu115@purdue.edu> wrote:
Dear Dr. Feng,
I am using your R package, GeneAnswers, for GO and KEGG pathway analysis. I ran into a problem converting Arabidopsis gene IDs to Entrez Gene IDs, and I do not know whether the package works on Arabidopsis. I have tried your example data from the human genome, and it works well.
When I use my Arabidopsis data, I get the following error:
[1] "geneInput has built in ..."
Error in geneAnswersBuilder(sdgGeneInput, "org.At.tair.db", categoryType = NULL, :
Annotation library can not be recognized! Abort GeneAnswers Building ...
If I want to run a GO analysis for Arabidopsis, how should I change my data set?
Thank you very much,
Fuyou
Botany and Plant Pathology Department
Purdue University