Converting EnSeMBL Probe names into Gene Name


Gundala Viswanath ▴ 230

@gundala-viswanath-2872

Last seen 11.3 years ago

Dear all,

Is there a way with Bioconductor in which I can convert such EnSemBL probe names into the standard gene names?

AFFX-M27830_5_at
AFFX-M27830_M_at
ENSG00000000003_at
ENSG00000000005_at
ENSG00000000419_at

- Gundala Viswanath
Jakarta - Indonesia

probe probe • 3.1k views

ADD COMMENT • link updated 17.3 years ago by Sean Davis 21k • written 17.3 years ago by Gundala Viswanath ▴ 230


Sean Davis 21k

@sean-davis-490

Last seen 10 months ago

United States

On Thu, Sep 18, 2008 at 4:37 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Dear all,
>
> Is there a way with Bioconductor in which I can
> convert such EnSemBL probe names into the
> standard gene names?
>
> AFFX-M27830_5_at
> AFFX-M27830_M_at
> ENSG00000000003_at
> ENSG00000000005_at
> ENSG00000000419_at

Hi, Gundala. In general, you do not need to cross-post to both the Bioconductor and R lists.

These are not standard Ensembl names. You could strip off the "_at" suffix, and some of them would become Ensembl gene IDs (the ones that begin with ENSG; the others look like Affymetrix control probes). Then you could use biomaRt to get information about them. See the biomaRt vignette and help pages for assistance.

Sean
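For readers following along, here is a minimal sketch of the two steps described above: dropping the "_at" suffix and then querying Ensembl. The probe IDs come from the question; the helper name `lookup_symbols` and the choice of the `hgnc_symbol` attribute are assumptions made for illustration, and the biomaRt call needs the package installed plus network access, so it is wrapped in a function and not run here.

```r
# Probe IDs copied from the question above.
probes <- c("AFFX-M27830_5_at", "AFFX-M27830_M_at",
            "ENSG00000000003_at", "ENSG00000000005_at",
            "ENSG00000000419_at")

# Keep only probes that look like Ensembl gene IDs; the AFFX ones are controls.
is_ensembl <- grepl("^ENSG[0-9]+_at$", probes)

# Strip the trailing "_at" to recover the Ensembl gene ID.
ens <- sub("_at$", "", probes[is_ensembl])
print(ens)

# Hypothetical helper (name is ours, not from the thread): look up gene
# symbols for a vector of Ensembl gene IDs with biomaRt.
# Requires the biomaRt package and an internet connection.
lookup_symbols <- function(ensembl_ids) {
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
                 filters    = "ensembl_gene_id",
                 values     = ensembl_ids,
                 mart       = mart)
}

# Usage (not run here because it needs network access):
# lookup_symbols(ens)
```

Filtering on the ENSG pattern before stripping the suffix keeps the control probes out of the query, which matches the distinction drawn in the reply.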

ADD COMMENT • link 17.3 years ago Sean Davis 21k


Another alternative is to use the org.Hs.eg.db package:

> library(org.Hs.eg.db)
Loading required package: AnnotationDbi
Loading required package: Biobase
Loading required package: tools

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

Loading required package: DBI
Loading required package: RSQLite
> ens <- c("ENSG00000000003","ENSG00000000005","ENSG00000000419")
> egs <- mget(ens, revmap(org.Hs.egENSEMBL))
> egs
$ENSG00000000003
[1] "7105"

$ENSG00000000005
[1] "64102"

$ENSG00000000419
[1] "8813"

> gns <- mget(unlist(egs), org.Hs.egSYMBOL)
> gns
$`7105`
[1] "TSPAN6"

$`64102`
[1] "TNMD"

$`8813`
[1] "DPM1"

Since most BioC annotation packages are Entrez Gene-centric, you will need to map via the Entrez Gene ID, whereas you can do the direct mapping using biomaRt.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
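As a hedged aside for anyone reading this later: more recent AnnotationDbi releases also provide `mapIds()` and `select()`, which take a key type directly, so the same lookup can be written without `mget()` and `revmap()`. The sketch below assumes org.Hs.eg.db is installed; the wrapper name `ensembl_to_symbol` is made up for illustration, and `multiVals = "first"` is one possible way to handle IDs that map to several genes, not a recommendation from this thread.

```r
ens <- c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419")

# Hypothetical wrapper (name is ours): Ensembl gene ID -> Entrez ID and symbol.
# Requires the Bioconductor packages AnnotationDbi and org.Hs.eg.db.
ensembl_to_symbol <- function(ensembl_ids) {
  db <- org.Hs.eg.db::org.Hs.eg.db
  data.frame(
    ensembl = ensembl_ids,
    # multiVals = "first" keeps one match per key when several exist.
    entrez  = unname(AnnotationDbi::mapIds(db, keys = ensembl_ids,
                                           keytype = "ENSEMBL",
                                           column = "ENTREZID",
                                           multiVals = "first")),
    symbol  = unname(AnnotationDbi::mapIds(db, keys = ensembl_ids,
                                           keytype = "ENSEMBL",
                                           column = "SYMBOL",
                                           multiVals = "first")),
    stringsAsFactors = FALSE
  )
}

# Run the lookup only when the annotation package is available.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  print(ensembl_to_symbol(ens))
}
```

The underlying data are still Entrez Gene-centric, as noted above; the key type argument only hides the intermediate step.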

ADD REPLY • link 17.3 years ago James W. MacDonald 68k


Hi Jim (and others),

Since this topic is of interest to me as well, do you have any pointers on how to construct an 'org.Hs.xx.db' library based on Ensembl IDs using the direct mappings from biomaRt?

In other words: I do know how to map Ensembl IDs to gene symbol, name, GO class, etc. using biomaRt, but I would like to 'merge' these separate files in such a way as to get a new-style annotation db package based on Ensembl IDs (thus avoiding the use of intermediate Entrez IDs). Or is this by definition an impossible task?

Thanks,
Guido

------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
internet: http://nutrigene.4t.com
email: guido.hooiveld at wur.nl

ADD REPLY • link 17.3 years ago Guido Hooiveld ★ 4.1k


On Fri, Sep 19, 2008 at 10:38 AM, Hooiveld, Guido <guido.hooiveld at wur.nl> wrote:
> Hi Jim (and others),
>
> Since this topic is of interest to me as well, do you have any pointer
> how to construct an 'org.Hs.xx.db' library based on ENSEMBL IDs using
> the direct mappings from BiomaRt?
> In other words; I do know how to map ENSEMBL IDs to gene symbol, name,
> GO class etc using biomart, but I would like to 'merge' these separate
> files such way to get a new-style annotation db package based on ENSEMBL
> IDs (thus avoiding the use of intermediate Entrez IDs). Or is this per
> definition an impossible task?

See the SQLForge documentation in the AnnotationDbi package. You can use the list of Ensembl IDs and their corresponding Entrez Gene IDs to construct a new annotation db package. Alternatively, you could get the Ensembl-to-Entrez Gene relationship using biomaRt. The final products will be similar, but probably not identical. Also, keep in mind that the actual data in the org.db packages are based on NCBI annotation even though the key would be an Ensembl ID.

With all that said, the simpler way to go is to convert your entire list to Entrez Gene IDs using either the org.Hs mappings or biomaRt and then proceed with the Entrez Gene ID as the key.

Sean
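To illustrate the "convert everything to Entrez Gene ID and use that as the key" route, here is a small base-R sketch. The Ensembl-to-Entrez pairs are the ones reported earlier in this thread; the `probe_data` table and its values are invented purely for illustration, and in practice the mapping table would come from the org.Hs mappings or biomaRt.

```r
# Ensembl -> Entrez pairs as reported earlier in this thread.
ens2eg <- data.frame(
  ensembl = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
  entrez  = c("7105", "64102", "8813"),
  stringsAsFactors = FALSE
)

# Hypothetical per-probe values, made up for this example only.
probe_data <- data.frame(
  probe = c("AFFX-M27830_5_at", "ENSG00000000003_at",
            "ENSG00000000005_at", "ENSG00000000419_at"),
  value = c(10.1, 7.2, 3.4, 8.8),
  stringsAsFactors = FALSE
)

# Derive the Ensembl ID from the probe name, then join on it.
probe_data$ensembl <- sub("_at$", "", probe_data$probe)
keyed <- merge(probe_data, ens2eg, by = "ensembl")

# Re-key the table by Entrez Gene ID. The control probe has no match in
# the mapping table, so the inner join drops it.
rownames(keyed) <- keyed$entrez
keyed <- keyed[, c("entrez", "ensembl", "probe", "value")]
print(keyed)
```

An inner join is used here so that unmapped probes disappear silently; passing `all.x = TRUE` to `merge()` would keep them with a missing Entrez ID instead, which may be preferable when you want to audit what failed to map.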

ADD REPLY • link 17.3 years ago Sean Davis 21k
