Question: using BioMart to query UniProt identifiers
8.4 years ago, Wolfgang RAFFELSBERGER wrote:
Dear list,

Context: I'd like to calculate GO enrichments for a list of UniProt identifiers (note that they are "ID" or "Entry name" and NOT "AC" or "Accession").

So I tried to use BioMart to extract the GO IDs for my list of UniProt identifiers; see the code below. After calling getBM(), R does not return to the command line for more than 5 minutes. I tested this on Linux and Windows and both show the same problem, so I suppose either I am doing something wrong or something isn't working right.

Any hints?

Thanks in advance,
Wolfgang Raffelsberger

## the code ..
    require(annotate)
    require(biomaRt)

    IDs <- c("MTMR1_HUMAN","MTMR2_HUMAN","MTMR3_HUMAN","MTMR4_HUMAN")   ## existing UniProt IDs

    uniProt <- useMart("unimart")
    listAttributes(useDataset("uniprot",mart=uniProt))   ## contains "name" and "go_id"
    GO_IDs <- getBM(attributes=c("name","go_id"), values=IDs, mart=useDataset("uniprot",mart=uniProt))
    ## after >5 minutes the command line is still not returned ...

## for completeness:
    sessionInfo()

    R version 2.12.2 (2011-02-25)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252
    [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
    [5] LC_TIME=French_France.1252

    attached base packages:
    [1] grDevices datasets  splines   graphics  stats     tcltk     utils
    [8] methods   base

    other attached packages:
    [1] biomaRt_2.6.0        annotate_1.28.0      AnnotationDbi_1.12.0
    [4] Biobase_2.10.0       svSocket_0.9-51      TinnR_1.0.3
    [7] R2HTML_2.2           Hmisc_3.8-3          survival_2.36-5

    loaded via a namespace (and not attached):
     [1] cluster_1.13.3  DBI_0.2-5       grid_2.12.2     lattice_0.19-17
     [5] RCurl_1.4-2.1   RSQLite_0.9-4   svMisc_0.9-61   tools_2.12.2
     [9] XML_3.1-0.1     xtable_1.5-6

Wolfgang Raffelsberger, PhD
IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300   Fax (+33) 388 65 3276
wolfgang.raffelsberger (at) igbmc.fr
Answer: using BioMart to query UniProt identifiers
8.4 years ago, Steffen Durinck wrote:
Hi Wolfgang,

There are a few issues:

1) Your getBM() query is missing a filter. Without one, you are querying for the GO IDs of everything in UniProt, which is probably why it takes so long. The following commands should be fast:

    uniProt <- useMart("unimart", dataset="uniprot")
    IDs <- c("MTMR1_HUMAN","MTMR2_HUMAN","MTMR3_HUMAN","MTMR4_HUMAN")
    GO_IDs <- getBM(attributes=c("name","go_id"), filters="accession", values=IDs, mart=uniProt)

2) You'll notice that you don't get anything back, because the values are entry names rather than accessions. You need to either give it an accession number (for MTMR1 this is Q13613) and use the accession filter, or give it a gene name, e.g. MTMR1, and use the gene_name filter:

    getBM(attributes=c("name","go_id"), filters="gene_name", values="MTMR1", mart=uniProt)

or

    getBM(attributes=c("name","go_id"), filters="accession", values="Q13613", mart=uniProt)

Cheers,
Steffen

On Wed, Apr 6, 2011 at 8:50 AM, Wolfgang RAFFELSBERGER <wraff at igbmc.fr> wrote:
> [...]
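As a rough illustration of the two points in the answer, the sketch below checks offline whether a vector of identifiers looks like UniProt accessions or entry names and then assembles matching getBM() arguments. The helper names classify_uniprot_id() and build_getBM_args() are hypothetical and not part of biomaRt. Stripping the species suffix from an entry name to obtain a gene name is only a heuristic, since the mnemonic does not always equal the gene symbol. The filter and attribute names are the ones quoted in this thread and may not exist on current BioMart servers.

```r
# UniProt accession format (6 or 10 characters), e.g. Q13613.
accession_re <- "^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
# Entry names look like MNEMONIC_SPECIES, e.g. MTMR1_HUMAN.
entry_name_re <- "^[A-Z0-9]{1,10}_[A-Z0-9]{1,5}$"

# Hypothetical helper: label each identifier by its apparent type.
classify_uniprot_id <- function(ids) {
  ifelse(grepl(accession_re, ids), "accession",
         ifelse(grepl(entry_name_re, ids), "entry_name", "unknown"))
}

# Hypothetical helper: build the argument list for getBM() so that a
# filter is always present and matches the identifier type.
build_getBM_args <- function(ids) {
  kind <- classify_uniprot_id(ids)
  if (all(kind == "accession")) {
    list(attributes = c("name", "go_id"), filters = "accession", values = ids)
  } else if (all(kind == "entry_name")) {
    # Heuristic (assumption): the part before "_" is used as the gene name.
    list(attributes = c("name", "go_id"), filters = "gene_name",
         values = unique(sub("_.*$", "", ids)))
  } else {
    stop("identifiers must be all accessions or all entry names")
  }
}

IDs <- c("MTMR1_HUMAN", "MTMR2_HUMAN", "MTMR3_HUMAN", "MTMR4_HUMAN")
args <- build_getBM_args(IDs)
```

With a mart object such as uniProt from the answer above, the query could then be issued as do.call(getBM, c(args, list(mart = uniProt))), assuming the server still offers that dataset.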