Converting EnSeMBL Probe names into Gene Name
@gundala-viswanath-2872
Dear all,

Is there a way with Bioconductor in which I can convert such EnSemBL probe names into the standard gene names?

AFFX-M27830_5_at
AFFX-M27830_M_at
ENSG00000000003_at
ENSG00000000005_at
ENSG00000000419_at

- Gundala Viswanath
Jakarta - Indonesia
@sean-davis-490
On Thu, Sep 18, 2008 at 4:37 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Dear all,
>
> Is there a way with Bioconductor in which I can
> convert such EnSemBL probe names into the
> standard gene names?
>
> AFFX-M27830_5_at
> AFFX-M27830_M_at
> ENSG00000000003_at
> ENSG00000000005_at
> ENSG00000000419_at

Hi, Gundala. In general, you do not need to cross-post to both the Bioconductor and R lists.

These are not standard Ensembl names. You could strip off the "_at" and some of them would become Ensembl gene IDs (the ones that begin with ENSG; the others look like Affymetrix control probes). Then you could use biomaRt to get information about them. See the biomaRt vignette and help pages for assistance.

Sean
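The approach Sean describes can be sketched roughly as follows. The first part is plain base R and uses the probe names from the question. The second part is only an illustration: the helper name `lookup_symbols`, the `hsapiens_gene_ensembl` dataset, and the `hgnc_symbol` attribute are assumptions that are not stated in the thread, and the query needs the biomaRt package plus a network connection.

```r
# Probe-set names from the original question.
probes <- c("AFFX-M27830_5_at", "AFFX-M27830_M_at",
            "ENSG00000000003_at", "ENSG00000000005_at",
            "ENSG00000000419_at")

# Strip the trailing "_at", then keep only names that look like
# Ensembl gene IDs; the AFFX-* control probes are dropped.
ids <- sub("_at$", "", probes)
ens <- ids[grepl("^ENSG[0-9]+$", ids)]

# Hypothetical helper (not from the thread): ask Ensembl BioMart for
# gene symbols. The dataset and attribute names are assumptions that
# fit human data; adjust them for other organisms.
lookup_symbols <- function(ensembl_ids) {
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
                 filters    = "ensembl_gene_id",
                 values     = ensembl_ids,
                 mart       = mart)
}

# Usage (not run here, needs network access):
# symbols <- lookup_symbols(ens)
```

Anchoring the pattern with `$` matters: it removes only a trailing "_at", so an ID that happened to contain "_at" elsewhere would be left intact.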
Another alternative is to use the org.Hs.eg.db package:

> library(org.Hs.eg.db)
Loading required package: AnnotationDbi
Loading required package: Biobase
Loading required package: tools

Welcome to Bioconductor

Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.

Loading required package: DBI
Loading required package: RSQLite
> ens <- c("ENSG00000000003","ENSG00000000005","ENSG00000000419")
> egs <- mget(ens, revmap(org.Hs.egENSEMBL))
> egs
$ENSG00000000003
[1] "7105"

$ENSG00000000005
[1] "64102"

$ENSG00000000419
[1] "8813"

> gns <- mget(unlist(egs), org.Hs.egSYMBOL)
> gns
$`7105`
[1] "TSPAN6"

$`64102`
[1] "TNMD"

$`8813`
[1] "DPM1"

Since most BioC annotation packages are Entrez Gene-centric, you will need to map via the Entrez Gene ID, whereas you can do the direct mapping using biomaRt.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
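For comparison, AnnotationDbi releases newer than the one used in this thread provide `mapIds()`, which does the Ensembl-to-symbol lookup in a single call. The following is a minimal sketch, assuming org.Hs.eg.db and AnnotationDbi are installed; the wrapper name `ensembl_to_symbol` is hypothetical, and `multiVals = "first"` is just one possible way to handle IDs that map to several symbols. The underlying database is still Entrez Gene-centric, as Jim notes.

```r
# Hypothetical wrapper (not from the thread) around AnnotationDbi::mapIds().
# Assumes the org.Hs.eg.db and AnnotationDbi packages are installed.
ensembl_to_symbol <- function(ens_ids) {
  AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db,
                        keys      = ens_ids,
                        column    = "SYMBOL",
                        keytype   = "ENSEMBL",
                        multiVals = "first")  # keep one symbol per ID
}

ens <- c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419")

# Only run the lookup when the annotation package is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  symbols <- ensembl_to_symbol(ens)  # named character vector
  print(symbols)
}
```

The result is a character vector named by the input IDs, which is usually easier to attach to a results table than the nested lists returned by `mget()`.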
Hi Jim (and others),

Since this topic is of interest to me as well, do you have any pointers on how to construct an 'org.Hs.xx.db' package based on Ensembl IDs using the direct mappings from biomaRt?

In other words: I know how to map Ensembl IDs to gene symbol, name, GO class, etc. using biomaRt, but I would like to merge these separate files so as to get a new-style annotation db package keyed on Ensembl IDs (thus avoiding the use of intermediate Entrez IDs). Or is this by definition an impossible task?

Thanks,
Guido

------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
internet: http://nutrigene.4t.com
email: guido.hooiveld at wur.nl
On Fri, Sep 19, 2008 at 10:38 AM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> Hi Jim (and others),
>
> Since this topic is of interest to me as well, do you have any pointer
> how to construct an 'org.Hs.xx.db' library based on ENSEMBL IDs using
> the direct mappings from BiomaRt?
> In other words; I do know how to map ENSEMBL IDs to gene symbol, name,
> GO class etc using biomart, but I would like to 'merge' these separate
> files such way to get a new-style annotation db package based on ENSEMBL
> IDs (thus avoiding the use of intermediate Entrez IDs). Or is this per
> definition an impossible task?

See the SQLForge documentation in the AnnotationDbi package. You can use the list of Ensembl IDs and their corresponding Entrez Gene IDs to construct a new annotation db package. Alternatively, you could get the Ensembl-to-Entrez Gene relationship using biomaRt. The final products will be similar, but probably not identical. Also, keep in mind that the actual data in the org.db packages are based on NCBI annotation, even though the key would be an Ensembl ID.

With all that said, the simpler way to go is to convert your entire list to Entrez Gene IDs using either the org.Hs mappings or biomaRt, and then proceed with the Entrez Gene ID as the key.

Sean
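The simpler route Sean recommends, converting everything to Entrez Gene IDs and using those as the key, might look roughly like the base R sketch below. The mapping table reuses the three ID pairs shown earlier in the thread; the expression values, the sample names, and the placeholder row `ENSG_UNMAPPED` are made up purely for illustration, and keeping the first row per Entrez ID is only one possible policy for many-to-one mappings.

```r
# Toy Ensembl -> Entrez mapping, using the IDs shown earlier in the thread.
# In practice this table would come from org.Hs.eg.db or biomaRt.
map <- data.frame(
  ensembl = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
  entrez  = c("7105", "64102", "8813"),
  stringsAsFactors = FALSE
)

# Made-up expression values (illustration only). The last row is a
# placeholder ID that has no entry in the mapping table.
expr <- matrix(c(1, 2,
                 3, 4,
                 5, 6,
                 7, 8),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("ENSG00000000003", "ENSG00000000005",
                                 "ENSG00000000419", "ENSG_UNMAPPED"),
                               c("s1", "s2")))

# Look up the Entrez ID for each row; unmapped rows become NA.
entrez <- map$entrez[match(rownames(expr), map$ensembl)]

# Drop unmapped rows and any row whose Entrez ID was already seen,
# so that the new row names are unique.
keep <- !is.na(entrez) & !duplicated(entrez)
expr_eg <- expr[keep, , drop = FALSE]
rownames(expr_eg) <- entrez[keep]
```

After this step `expr_eg` is keyed by Entrez Gene ID, so the standard Entrez-centric annotation packages can be used on it directly.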