Retrieving all entrez identifiers that are annotated in KEGG pathways
@james-w-macdonald-5106
Last seen 5 hours ago
United States
Hi Anirban,

> library(hgu133plus2.db)
> x <- select(hgu133plus2.db, Lkeys(hgu133plus2PATH), c("ENTREZID","PATH"))
Warning message:
In .generateExtraRows(tab, keys, jointype) :
  'select' resulted in 1:many mapping between keys and return rows
> head(x)
    PROBEID ENTREZID  PATH
1 1007_s_at      780  <NA>
2   1053_at     5982 03030
3   1053_at     5982 03420
> egids <- unique(x$ENTREZID[!is.na(x$PATH)])
> length(egids)
[1] 5498

Best,

Jim

On 3/2/2013 8:20 AM, Anirban [guest] wrote:
> Dear all,
>
> Is there any way to get all Entrez identifiers that are annotated with KEGG pathways? I am using the GOstats package in R to perform KEGG pathway enrichment analysis. In general, for each KEGG pathway term there is a list of annotated HGNC symbols or Entrez identifiers. Across all KEGG pathway terms there must be one combined list of Entrez identifiers. I want to have that list.
>
> What I am doing right now is as follows:
>
> library(biomaRt)
> library("GO.db")
> library("KEGG.db")
> library("GOstats")
> library("hgu133plus2.db")
> library("EMA")
> library("fdrtool")
> library("org.Hs.eg.db")
> ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> x <- hgu133plus2PATH
> mapped_probes <- mappedkeys(x)
>
> b <- getBM(attributes=c("hgnc_symbol"), filters="affy_hg_u133_plus_2", values=mapped_probes, mart=ensembl)
>
> Is this the correct way to do that?
>
> Thanks in advance. :)
>
> -- output of sessionInfo():
>
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
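The filtering step in Jim's answer can be tried without installing the annotation package. The following is only a minimal sketch on a hand-made data frame that mimics the shape of the select() output: the first three rows copy the head(x) output from the answer, while the probe IDs "probe_a" and "probe_b", the Entrez ID "900001", and their pathway assignments are made up for illustration.

```r
# Toy stand-in for the data frame returned by select().
# Rows 1-3 copy head(x) from the answer; rows 4-5 are hypothetical.
x <- data.frame(
  PROBEID  = c("1007_s_at", "1053_at", "1053_at", "probe_a", "probe_b"),
  ENTREZID = c("780", "5982", "5982", "5982", "900001"),
  PATH     = c(NA, "03030", "03420", "03030", "04010"),
  stringsAsFactors = FALSE
)

# A gene appears once per probe and once per pathway (the 1:many mapping
# that select() warns about), so drop rows without a pathway and then
# collapse the repeated Entrez IDs.
has_path <- !is.na(x$PATH)
egids <- unique(x$ENTREZID[has_path])

print(egids)
length(egids)
```

Gene 780 is dropped because its only row has no pathway, and gene 5982 is counted once even though it appears in three rows. The same logic applied to the real select() result is what yields the 5498 identifiers reported in the answer.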
GOstats GOstats • 1.1k views
ANIRBAN BHAR ▴ 60
@anirban-bhar-4836
Last seen 9.6 years ago
Hi James,

Thanks a lot.

Best regards,
Anirban

--
Anirban Bhar
Department of Bioinformatics
University of Goettingen
Goettingen, Germany