Question: Ensembl identifiers mapping to Gene Symbol
maedakus wrote (16 months ago):

Hi,

Nice to meet you.

I am planning to convert Ensembl gene IDs (ENSG) into gene symbols.

Could you tell me how to do it?

Regards, and thanks in advance!

Assa Yeroslaviz (Munich, Germany) wrote (16 months ago):

biomaRt is easy to use, either as the R package or through the BioMart website from Ensembl.
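For reference, a minimal sketch of what the R route could look like, assuming the Bioconductor package biomaRt is installed and the Ensembl server is reachable. The function name ensg_to_symbol is made up for illustration; useEnsembl() and getBM() are biomaRt functions, and the dataset and attribute names are the usual ones for human but are worth checking with listAttributes().

```r
# Sketch only: needs the Bioconductor package biomaRt and a network connection.
# ensg_to_symbol is a hypothetical helper name, not part of any package.
ensg_to_symbol <- function(ensg_ids) {
  # Connect to the Ensembl gene mart for human.
  mart <- biomaRt::useEnsembl(biomart = "genes",
                              dataset = "hsapiens_gene_ensembl")
  # Ask for one row per (Ensembl gene ID, HGNC symbol) pair.
  # IDs without a symbol may come back with an empty symbol or be missing.
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol"),
    filters    = "ensembl_gene_id",
    values     = ensg_ids,
    mart       = mart
  )
}

# Example call (commented out because it needs a live connection):
# annot <- ensg_to_symbol(c("ENSG00000000003", "ENSG00000000419"))
```

The result is a data frame, so it can be merged back onto an expression table by the Ensembl ID column.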

WouterDeCoster (Antwerp, Belgium) wrote (16 months ago):

What have you tried? There must be dozens of questions like this.

As Assa wrote, biomaRt is indeed easy to use. I see you tagged org.Hs.eg.db, which is also rather straightforward.

If you show what you tried and what didn't work, people can help you more specifically. You can't just ask for an entire explanation without showing some work of your own.


Dear sir,

Thank you so much for your answer.

What I have tried is shown below. I used the Bioconductor package "org.Hs.eg.db".

My script:

tsquamous$symbol <- mapIds(org.Hs.eg.db,
                     keys=row.names(tsquamous),
                     column="SYMBOL",
                     keytype="ENSEMBL",
                     multiVals="first")

Warning message:
In tsquamous$symbol <- mapIds(org.Hs.eg.db, keys = row.names(tsquamous),  :
  Coercing LHS to a list

Unfortunately, many of the ENSG IDs are not matched. Below is the output of the code above.

Many "NA" values remain.

    ENSG00000226051          ENSG00000226053          ENSG00000226067
            "ZNF503-AS1"              "LOC729987"                       NA
         ENSG00000226085          ENSG00000226091          ENSG00000226121
                      NA              "LINC00937"                       NA
         ENSG00000226137          ENSG00000226194          ENSG00000226200
            "BAIAP2-AS1"                       NA                       NA
         ENSG00000226210          ENSG00000226232          ENSG00000226259
                "WASH7P"                       NA                       NA
         ENSG00000226287          ENSG00000226312          ENSG00000226314
              "TMEM191A"              "CFLAR-AS1"               "ZNF192P1"

 

head(tsquamous)
                ICGC_0009 ICGC_0021 ICGC_0025 ICGC_0037 ICGC_0054 ICGC_0067
ENSG00000000003  5.375416  5.411612  5.620586  4.902278  4.741611  4.707213
ENSG00000000419  4.587680  4.386663  4.806422  5.209093  4.767072  4.624588
ENSG00000000457  3.638524  3.495744  3.551271  3.879609  4.093864  3.745619
ENSG00000000460  3.214169  3.198682  4.087306  3.256023  3.143017  2.423337

Reply written 16 months ago by maedakus

Since you got some results, it seems your code is valid. I did a quick check of some of the genes that yielded "NA", and they appear to be pseudogenes, which were likely not included in org.Hs.eg.db. You'll probably be able to solve this using BioMart, which is available online from Ensembl (http://www.ensembl.org/biomart/martview/) and also exists as an R package, biomaRt, in case you want to automate the conversion (https://bioconductor.org/packages/release/bioc/html/biomaRt.html). The website in particular is straightforward to use.
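A side note on the "Coercing LHS to a list" warning in the script above: R gives it when `$<-` is used on an atomic object such as a matrix, so tsquamous is probably a matrix rather than a data frame, and the assignment would have turned it into a list. Below is a minimal base R sketch of one way around it. It assumes a toy stand-in called expr (the values in its last row are made up) and a hand-written lookup vector shaped like the named vector that mapIds() returns; it also counts how many IDs are left unmapped.

```r
# Toy stand-in for tsquamous: a numeric matrix with Ensembl IDs as row names.
# The last row's values are invented for illustration.
expr <- matrix(
  c(5.375416, 5.411612,
    4.587680, 4.386663,
    1.000000, 2.000000),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("ENSG00000000003", "ENSG00000000419", "ENSG00000226067"),
    c("ICGC_0009", "ICGC_0021")
  )
)

# Hand-written lookup, shaped like the named character vector from mapIds().
symbols <- c(ENSG00000000003 = "TSPAN6",
             ENSG00000000419 = "DPM1",
             ENSG00000226067 = NA)

# Convert to a data frame first, so `$<-` adds a column instead of coercing
# the matrix to a list.
expr_df <- as.data.frame(expr)
expr_df$symbol <- unname(symbols[rownames(expr_df)])

# Count the IDs that did not map to a symbol.
n_unmapped <- sum(is.na(expr_df$symbol))
```

The IDs that remain NA here are the ones worth sending to BioMart.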

Reply written 16 months ago by WouterDeCoster

Thank you so much.

Reply written 16 months ago by maedakus