Hello, I am trying to retrieve ENTREZGENE IDs using ENSEMBL IDs as queries with biomaRt in R, but it does not retrieve them properly. Instead, the entrezgene column comes back filled with HGNC symbols rather than ENTREZGENE IDs.
See this MWE:
library(biomaRt)
ensembl <- useMart("ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl", host="www.ensembl.org")
genes <- c("ENSG00000121671", "ENSG00000142208", "ENSG00000171051", "ENSG00000115271", "ENSG00000143537")
getBM(attributes=c('ensembl_gene_id','entrezgene','hgnc_id','hgnc_symbol'), filters='ensembl_gene_id', values=genes, mart=ensembl)
It returns
  ensembl_gene_id entrezgene    hgnc_id hgnc_symbol
1 ENSG00000115271        GCA HGNC:15990         GCA
2 ENSG00000121671       CRY2  HGNC:2385        CRY2
3 ENSG00000142208       AKT1   HGNC:391        AKT1
4 ENSG00000143537     ADAM15   HGNC:193      ADAM15
5 ENSG00000171051       FPR1  HGNC:3826        FPR1
How can I retrieve the ENTREZGENE IDs correctly? Thanks.
Can you update your post to include the output from sessionInfo()? I'd like to check what version of biomaRt you're using, as I get the entrezgene IDs returned correctly.

I get the same as the OP:
But this is what I get if I use the Biomart mirror at useast.ensembl.org, so it seems it hasn't updated yet. If I try to hack things to use ensembl.org, I get this:
Thanks for testing the code and reporting your findings. I suspect this is related to the issues reported (and fixed) in A: Ensembl 88 is out!
It has made me question the effectiveness of the host argument to useMart(), since I always seem to end up on the main ensembl site, presumably due to some geo-location redirection. I'll see if I can override this in the biomaRt code.

Hi Mike,
I actually never noticed this behaviour before. If this can be overridden in the biomaRt code, that would be great. To override the automatic ensembl mirror redirect, you can use the following flag in the URL: "?redirect=no". E.g.:
http://uswest.ensembl.org/index.html?redirect=no
This should bring you straight to the uswest ensembl mirror.
Cheers,
Thomas
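As a rough illustration of the "?redirect=no" hint above, the flag can be appended to any mirror host with a small helper. This is only a sketch: no_redirect_url() is a hypothetical function written for this example, not part of biomaRt, and it only builds the URL string. It says nothing about whether useMart() itself honours the flag.

```r
# Hypothetical helper (not part of biomaRt): build a mirror URL that
# carries the "?redirect=no" flag described above.
no_redirect_url <- function(host, path = "/index.html") {
  host <- sub("^https?://", "", host)  # drop any protocol prefix
  host <- sub("/+$", "", host)         # drop trailing slashes
  paste0("http://", host, path, "?redirect=no")
}

no_redirect_url("uswest.ensembl.org")
# "http://uswest.ensembl.org/index.html?redirect=no"
```

Opening the resulting URL in a browser should keep you on the named mirror rather than being sent to the nearest one.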
Thanks for the hint. This might be frustrating for all involved, but it has exposed an interesting 'feature'!
I'll take a look in the next few days - there's not much point in the argument if it silently doesn't work.
Sorry, I'm just seeing the messages given the time difference... do you still need to see sessionInfo()? I guess the answer here is to just wait, right?
sessionInfo() is always useful, and as a general rule you should include it in any post you make here, as someone will inevitably ask you for it. But in this case it looks like the issue is with your local ensembl mirror rather than the biomaRt package, so you'll have to wait either for the mirror to be updated or for me to figure out how to force biomaRt to query the main site. I suspect the ensembl fix will come first; they're normally very good at sorting issues like this.
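
While waiting for the mirror, one way to catch this problem in a result is to check that the entrezgene column only contains numeric values, since Entrez Gene IDs are integers and symbols such as GCA are not. The sketch below assumes the result is a data frame like the one returned by getBM(); flag_bad_entrez() is a hypothetical helper made up for this example, and the test data are two rows copied from the output in the question.

```r
# Hypothetical helper: return the rows whose entrezgene value is not a
# plain integer. NA and empty values (no Entrez mapping) are not flagged.
flag_bad_entrez <- function(res, column = "entrezgene") {
  ids <- as.character(res[[column]])
  bad <- !is.na(ids) & nzchar(ids) & !grepl("^[0-9]+$", ids)
  res[bad, , drop = FALSE]
}

# Two rows copied from the output shown in the question
res <- data.frame(
  ensembl_gene_id = c("ENSG00000115271", "ENSG00000121671"),
  entrezgene      = c("GCA", "CRY2"),
  stringsAsFactors = FALSE
)

nrow(flag_bad_entrez(res))  # 2: both values are symbols, not numeric IDs
```

If this returns any rows, the mirror being queried is probably still serving the broken mapping.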