Question: Ensembl identifiers mapping to Gene Symbol
2.0 years ago by
maedakus10
maedakus10 wrote:

Hi,

Nice to meet you.

I am planning to convert ENSG IDs (Ensembl gene identifiers) to gene symbols.

Could you tell me how to do it?

2.0 years ago by
Assa Yeroslaviz (1.4k)
Munich, Germany
Assa Yeroslaviz wrote:

biomaRt is easy to use, either as the R package or through the BioMart website from Ensembl.

2.0 years ago by
Antwerp, Belgium
WouterDeCoster110 wrote:

What have you tried? There must be dozens of questions like this.

As Assa wrote, biomaRt is indeed easy to use. I see you tagged org.Hs.eg.db, which is also rather straightforward.

If you show what you tried and what didn't work, people can help you more specifically. You can't just ask for an entire explanation without showing some work of your own.

Dear sir,

Thank you so much for your answer.

What I have tried is shown below. I used the Bioconductor package "org.Hs.eg.db".

My script:

tsquamous$symbol <- mapIds(org.Hs.eg.db, keys=row.names(tsquamous), column="SYMBOL", keytype="ENSEMBL", multiVals="first")

Warning message:
In tsquamous$symbol <- mapIds(org.Hs.eg.db, keys = row.names(tsquamous),  :
  Coercing LHS to a list

Unfortunately, many of the ENSG IDs are not mapped. The output of the code above is shown below; many "NA" values remain.

ENSG00000226051 ENSG00000226053 ENSG00000226067
   "ZNF503-AS1"     "LOC729987"              NA
ENSG00000226085 ENSG00000226091 ENSG00000226121
             NA     "LINC00937"              NA
ENSG00000226137 ENSG00000226194 ENSG00000226200
   "BAIAP2-AS1"              NA              NA
ENSG00000226210 ENSG00000226232 ENSG00000226259
       "WASH7P"              NA              NA
ENSG00000226287 ENSG00000226312 ENSG00000226314
     "TMEM191A"     "CFLAR-AS1"      "ZNF192P1"

                ICGC_0009 ICGC_0021 ICGC_0025 ICGC_0037 ICGC_0054 ICGC_0067
ENSG00000000003  5.375416  5.411612  5.620586  4.902278  4.741611  4.707213
ENSG00000000419  4.587680  4.386663  4.806422  5.209093  4.767072  4.624588
ENSG00000000457  3.638524  3.495744  3.551271  3.879609  4.093864  3.745619
ENSG00000000460  3.214169  3.198682  4.087306  3.256023  3.143017  2.423337
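
A note on the warning in the script above: R emits "Coercing LHS to a list" when `$<-` is applied to an atomic object such as a matrix. That suggests tsquamous is a numeric matrix rather than a data frame, although the thread does not confirm this, so treat it as an assumption. Below is a minimal sketch of one way around it, using a two-row toy matrix and a hard-coded lookup vector standing in for the mapIds() result. The names expr, symbols, and expr_df are illustrative only.

```r
# Toy stand-in for the expression matrix (two rows, two samples from the post).
expr <- matrix(
  c(5.375416, 4.587680, 5.411612, 4.386663),
  nrow = 2,
  dimnames = list(c("ENSG00000000003", "ENSG00000000419"),
                  c("ICGC_0009", "ICGC_0021"))
)

# Illustrative stand-in for the named character vector that mapIds() returns.
symbols <- c(ENSG00000000003 = "TSPAN6", ENSG00000000419 = "DPM1")

# `expr$symbol <- ...` would coerce the matrix to a list (with the warning).
# Converting to a data frame first keeps the table structure intact.
expr_df <- as.data.frame(expr)
expr_df$symbol <- unname(symbols[rownames(expr_df)])

print(expr_df)
```

Keeping the symbols in a separate named vector, instead of adding a character column, is an equally reasonable choice if the numeric matrix is needed unchanged downstream.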

Since you got some results, it seems your code is valid. I did a quick check of some of the genes which yielded "NA", and it appears those are pseudogenes. Likely these were not included in org.Hs.eg.db. You'll probably be able to solve this using BioMart, which is available online from Ensembl (http://www.ensembl.org/biomart/martview/) but also exists as an R package, biomaRt, in case you want to automate the conversion (https://bioconductor.org/packages/release/bioc/html/biomaRt.html). The website in particular is straightforward to use.
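
For the R package route, a minimal sketch might look like the following. It assumes biomaRt is installed and that the Ensembl server is reachable. The helper names fetch_annotation and symbols_for are hypothetical, not part of biomaRt; the biomaRt calls used are useEnsembl() and getBM(). Asking for gene_biotype as well makes it possible to check whether the unmapped IDs really are pseudogenes.

```r
# Query Ensembl BioMart for gene names and biotypes.
# Requires the biomaRt package and network access; nothing runs until called.
fetch_annotation <- function(ids) {
  mart <- biomaRt::useEnsembl(biomart = "genes",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "gene_biotype"),
    filters    = "ensembl_gene_id",
    values     = ids,
    mart       = mart
  )
}

# Align a BioMart result table to the original ID order.
# IDs absent from the table, or with an empty gene name, come back as NA.
symbols_for <- function(ids, annot) {
  annot <- annot[!duplicated(annot$ensembl_gene_id), ]   # keep first hit per ID
  sym <- annot$external_gene_name[match(ids, annot$ensembl_gene_id)]
  sym[!is.na(sym) & sym == ""] <- NA
  setNames(sym, ids)
}

# Usage (not run here, since it needs a live connection):
# ids   <- rownames(tsquamous)
# annot <- fetch_annotation(ids)
# syms  <- symbols_for(ids, annot)
# table(annot$gene_biotype)   # inspect which biotypes are present
```

Some IDs may still return NA, for example identifiers retired from the current Ensembl release, so it is worth counting the remaining NAs before relying on the symbols.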