The code you posted won't actually work, as 'genes' is not a valid argument for 'biomart':
> ensembl <- useEnsembl(biomart = "genes", dataset = "mmusculus_gene_ensembl")
Ensembl site unresponsive, trying asia mirror
Error in textConnection(text, encoding = "UTF-8") :
invalid 'text' argument
If I do it the right way I see the same error as you though.
> gns
[1] "ENSMUSG00000051951" "ENSMUSG00000089699" "ENSMUSG00000102331"
[4] "ENSMUSG00000102343" "ENSMUSG00000025900" "ENSMUSG00000025902"
> mart <- useEnsembl("ensembl","mmusculus_gene_ensembl")
Ensembl site unresponsive, trying useast mirror
Ensembl site unresponsive, trying uswest mirror <------ WUT
Ensembl site unresponsive, trying useast mirror
> getBM(c("ensembl_gene_id","mgi_symbol","description"), "ensembl_gene_id", gns, mart)
Error in .processResults(postRes, mart = mart, hostURLsep = sep, fullXmlQuery = fullXmlQuery, :
Query ERROR: caught BioMart::Exception::Database: Error during query execution: Table 'ensembl_mart_107.mmusculus_gene_ensembl__ox_mgi__dm' doesn't exist
Mike Smith will probably be along shortly with information about what's happening on the Biomart/Ensembl side of things. The issue with trying to query an online database is that sometimes it's down/not available/whatever. But there are other options! Johannes Rainer makes EnsDb packages from each Ensembl version that you can get from the AnnotationHub:
> library(AnnotationHub)
> hub <- AnnotationHub()
snapshotDate(): 2022-04-21
> query(hub, c("ensdb","mus musculus"))
AnnotationHub with 20 records
AH53222 | Ensembl 87 EnsDb for Mus Musculus
AH53726 | Ensembl 88 EnsDb for Mus Musculus
AH56691 | Ensembl 89 EnsDb for Mus Musculus
AH57770 | Ensembl 90 EnsDb for Mus Musculus
AH60788 | Ensembl 91 EnsDb for Mus Musculus
... ...
AH89211 | Ensembl 102 EnsDb for Mus musculus
AH89457 | Ensembl 103 EnsDb for Mus musculus
AH95775 | Ensembl 104 EnsDb for Mus musculus
AH98078 | Ensembl 105 EnsDb for Mus musculus
AH100674 | Ensembl 106 EnsDb for Mus musculus
> ensdb <- hub[["AH100674"]]
loading from cache
> select(ensdb, gns, c("GENENAME","DESCRIPTION"), "GENEID")
              GENEID GENENAME
1 ENSMUSG00000051951     Xkr4
2 ENSMUSG00000089699   Gm1992
3 ENSMUSG00000102331  Gm19938
4 ENSMUSG00000102343  Gm37381
5 ENSMUSG00000025900      Rp1
6 ENSMUSG00000025902    Sox17
  DESCRIPTION
1 X-linked Kx blood group related 4 [Source:MGI Symbol;Acc:MGI:3528744]
2 predicted gene 1992 [Source:MGI Symbol;Acc:MGI:3780162]
3 predicted gene, 19938 [Source:MGI Symbol;Acc:MGI:5012123]
4 predicted gene, 37381 [Source:MGI Symbol;Acc:MGI:5610609]
5 retinitis pigmentosa 1 (human) [Source:MGI Symbol;Acc:MGI:1341105]
6 SRY (sex determining region Y)-box 17 [Source:MGI Symbol;Acc:MGI:107543]
Unlike a query to biomaRt, the return order from select is guaranteed to be the same as the input data, and by default always includes the input values as well. As a defensive programming step you might want to use merge or match when combining with other data to be doubly sure the order is OK, but that's up to you.
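For what it's worth, here is a minimal sketch of that defensive step in base R. The annot and expr data frames and the count values are made up for illustration (annot stands in for whatever an annotation query returned, deliberately out of order and missing one ID); only the gene IDs and symbols come from the output above.

```r
# Three of the query IDs from above.
gns <- c("ENSMUSG00000051951", "ENSMUSG00000089699", "ENSMUSG00000025902")

# Hypothetical annotation result: different order, one ID missing.
annot <- data.frame(
  GENEID   = c("ENSMUSG00000025902", "ENSMUSG00000051951"),
  GENENAME = c("Sox17", "Xkr4"),
  stringsAsFactors = FALSE
)

# Hypothetical per-gene data, in the same order as gns.
expr <- data.frame(GENEID = gns, count = c(10, 20, 30),
                   stringsAsFactors = FALSE)

# match(): look up each ID's row in annot; IDs with no hit become NA,
# so the result always lines up with expr regardless of annot's order.
expr$symbol <- annot$GENENAME[match(expr$GENEID, annot$GENEID)]

# merge(): join on the shared key and keep every row of expr (all.x = TRUE).
# Don't rely on the row order of the result; the key column does the pairing.
merged <- merge(expr[, c("GENEID", "count")], annot,
                by = "GENEID", all.x = TRUE)
```

Either way the pairing is done on the ID itself rather than on row position, so a reordered or incomplete annotation table can't silently shift symbols onto the wrong genes.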
Thanks James.
I actually followed the script shown on the website, and "genes" is used as an argument there.
I appreciate the alternative way to solve my problem. But, as Mike posted below, my problem will be solved soon.
Ah, good point! I didn't see that biomaRt had changed the choices for the 'biomart' argument. Thanks for pointing that out.