Question

getBM error

Fara • 0

@8e7ca809

Last seen 20 months ago

Japan

Since this is my first time posting a question, I'd appreciate it if you could answer politely.

I have a problem translating Ensembl IDs to Entrez Gene IDs. I've tried the script below; it runs without errors, but it returns only "NA" in the entrezgene_id column. What might be the issue?

Thank you for your assistance.


library(biomaRt)

# 'gene' is a character vector of stickleback Ensembl gene IDs
ensembl <- useMart(biomart = "ensembl", dataset = "gaculeatus_gene_ensembl")

res <- getBM(attributes = c("ensembl_gene_id", "entrezgene_id"),
             filters = "ensembl_gene_id",
             values = gene,
             mart = ensembl, useCache = FALSE)

ensembldb • 1.2k views

ADD COMMENT • link updated 21 months ago by James W. MacDonald 68k • written 21 months ago by Fara • 0

Can you provide an example of the Ensembl IDs you're trying to convert?

ADD REPLY • link 21 months ago Mike Smith ★ 6.6k

Thank you for the reply. The examples are these.

ENSGACG00000014473 ENSGACG00000015168 ENSGACG00000007529

ADD REPLY • link 21 months ago Fara • 0

score 1 · Answer 1 · 2024-04-24

James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 2 hours ago

United States

The genes you present are orthologous mappings from D. rerio to other teleost fishes, so they will probably be difficult to map to NCBI Gene IDs. As an example, the first gene is etv5b, and if you search NCBI for it, stickleback isn't even listed. An even more directed search also returns nothing.
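A directed search of this kind can also be scripted. The sketch below is only an illustration: it assumes the rentrez package (not used elsewhere in this thread) and network access to NCBI, and both helper names are hypothetical. `[Gene Name]` and `[Organism]` are standard NCBI Gene search fields.

```r
# Build an NCBI Gene query restricted to one gene symbol and one organism.
build_gene_query <- function(symbol, organism) {
  sprintf("%s[Gene Name] AND %s[Organism]", symbol, organism)
}

# Hypothetical helper: count the NCBI Gene records matching the query.
# Assumes the 'rentrez' package is installed and NCBI is reachable.
count_ncbi_genes <- function(symbol, organism) {
  hits <- rentrez::entrez_search(db = "gene",
                                 term = build_gene_query(symbol, organism))
  hits$count
}

query <- build_gene_query("etv5b", "Gasterosteus aculeatus")
print(query)
# count_ncbi_genes("etv5b", "Gasterosteus aculeatus")  # run when online
```

A count of zero for the stickleback query, next to a non-zero count for the same symbol in Danio rerio, would be consistent with the gene not having its own NCBI Gene record in stickleback.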

My general rule is that you should never try to map between Ensembl and NCBI IDs unless absolutely necessary, because there are any number of reasons why what appears to be a simple mapping is not simple at all.

ADD COMMENT • link 21 months ago James W. MacDonald 68k

Thank you for your response.

The primary reason for translating Ensembl IDs into Entrez Gene IDs is to perform KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis.

Could you please inform me if there are any methods available for conducting KEGG enrichment analysis using Ensembl IDs?

ADD REPLY • link 21 months ago Fara • 0

I believe you need NCBI Gene IDs for KEGG, in which case you may need to map. The three genes you have shown here don't map, and all three appear to be orthologs of either D. rerio or H. sapiens genes. As an example,

> library(AnnotationHub)
> hub <- AnnotationHub()
> zz <-  hub[["AH116275"]]
downloading 1 resources
retrieving 1 resource
  |===========================| 100%

loading from cache
require("ensembldb")
Warning message:
package 'GenomeInfoDb' was built under R version 4.3.2 
> zz
EnsDb for Ensembl:
|Backend: SQLite
|Db type: EnsDb
|Type of Gene ID: Ensembl Gene ID
|Supporting package: ensembldb
|Db created by: ensembldb package from Bioconductor
|script_version: 0.3.10
|Creation time: Mon Jan 15 16:00:24 2024
|ensembl_version: 111
|ensembl_host: localhost
|Organism: Gasterosteus aculeatus
|taxonomy_id: 69293
|genome_build: BROADS1
|DBSCHEMAVERSION: 2.2
|common_name: three-spined stickleback
|species: gasterosteus_aculeatus
| No. of genes: 22456.
| No. of transcripts: 29245.
|Protein data available.

> genes <- c("ENSGACG00000014473", "ENSGACG00000015168", "ENSGACG00000007529")
> select(zz, genes, c("GENEID","SYMBOL","ENTREZID"))
              GENEID SYMBOL ENTREZID
1 ENSGACG00000014473  etv5b       NA
2 ENSGACG00000015168              NA
3 ENSGACG00000007529  CNNM1       NA

> gns2 <- tolower(mapIds(zz, genes, "SYMBOL","GENEID"))
> gns2
ENSGACG00000014473 
           "etv5b" 
ENSGACG00000015168 
                "" 
ENSGACG00000007529 
           "cnnm1" 
> library(org.Dr.eg.db)
> select(org.Dr.eg.db, gns2, "ENTREZID", "SYMBOL")
'select()' returned 1:1 mapping
between keys and columns
  SYMBOL ENTREZID
1  etv5b    30452
2            <NA>
3  cnnm1   562504

And then maybe you could do the KEGG analysis based on D. rerio instead?
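The D. rerio route might be sketched roughly as follows. This is an assumption-laden illustration, not a tested pipeline: it uses limma's `kegga()` with `species = "Dr"` (which expects Entrez Gene IDs and needs network access to KEGG), and the helper names `clean_symbols` and `zebrafish_kegg` are hypothetical. Matching on lowercased symbols is only a heuristic, and genes without a symbol are silently dropped.

```r
# Turn stickleback symbols into candidate zebrafish symbols:
# lowercase, then drop missing or empty entries and duplicates.
clean_symbols <- function(symbols) {
  s <- tolower(symbols)
  unique(s[!is.na(s) & nzchar(s)])
}

# Hypothetical helper: map symbols to zebrafish Entrez IDs, then run KEGG.
# Assumes org.Dr.eg.db, AnnotationDbi and limma are installed.
zebrafish_kegg <- function(de_symbols, universe_symbols) {
  to_entrez <- function(sym) {
    ids <- AnnotationDbi::mapIds(org.Dr.eg.db::org.Dr.eg.db,
                                 keys = clean_symbols(sym),
                                 column = "ENTREZID", keytype = "SYMBOL")
    unique(ids[!is.na(ids)])
  }
  # The universe should be all genes tested, mapped the same way.
  limma::kegga(to_entrez(de_symbols),
               universe = to_entrez(universe_symbols),
               species = "Dr")
}

syms <- clean_symbols(c("etv5b", "", "CNNM1", NA))
print(syms)
```

With the packages installed, something like `limma::topKEGG(zebrafish_kegg(de_symbols, all_symbols))` would then list the top pathways, where `de_symbols` and `all_symbols` are placeholder names for your gene lists. It is worth checking how many genes survive the symbol mapping before interpreting the result.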

ADD REPLY • link 21 months ago James W. MacDonald 68k