Question: biomaRt scan error for latest assembly when indicating version Number
3 months ago by
mullpaul0
Rockefeller University
mullpaul0 wrote:

I am running biomaRt version 2.35.11 and get a scan error when I request the most recent release of a dataset by its version number (version 91). Omitting the version argument does access the latest dataset, but is it possible to specify the latest version explicitly, so that running the code in the future will always access this same dataset?

> mart_obj <- useEnsembl(biomart="ensembl", dataset='mmusculus_gene_ensembl', version=91)
> mart_obj@host
[1] "http://dec2017.archive.ensembl.org:80/biomart/martservice"
> df <- getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id', 'external_gene_name', 'chromosome_name', 'description'),
+             mart       = mart_obj)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
line 1 did not have 3 elements
modified 3 months ago by Mike Smith2.7k • written 3 months ago by mullpaul0

It looks like http://dec2017.archive.ensembl.org redirects you to http://www.ensembl.org which I'm pretty sure was not the behaviour before the latest release - I've definitely recommended using the archive URL for the current release for exactly the reason you state.

This is annoying, as I recently turned off following redirection in biomaRt so that queries end up where they're requested to be sent, but this breaks your query. I'll have a look at the code and see if there's an obvious workaround, but the most straightforward solution would be for Ensembl to let you use the dec2017 URL directly, rather than forcing the redirection.

Thanks a lot Mike! Not the end of the world if I just have to wait for 91 to get a true archive page, but agreed that this is pretty annoying.

3 months ago by
Mike Smith2.7k
EMBL Heidelberg / de.NBI
Mike Smith2.7k wrote:

This should now be fixed in biomaRt version 2.35.12. If you provide either the current version number to useEnsembl() or the equivalent URL to useMart(), this will be detected and adjusted appropriately, e.g.

> library(biomaRt)
>
> mart_obj <- useEnsembl(biomart="ensembl", dataset='mmusculus_gene_ensembl', version=91)
Note: requested host was redirected from
http://dec2017.archive.ensembl.org to http://www.ensembl.org:80/biomart/martservice
This often occurs when connecting to the archive URL for the current Ensembl release
You can check the current version number using listEnsemblArchives()
>
> df <- getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id'),
+      mart       = mart_obj)
>
> dim(df)
[1] 135075      2
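
For the useMart() route mentioned above, a minimal sketch might look like the following. Both archive_host() and connect_archive() are hypothetical helpers, not part of biomaRt, and the month-plus-year host pattern is an assumption generalised from the dec2017 host shown earlier. The actual connection needs biomaRt and network access, so the call is left commented out.

```r
# Hypothetical helper: build an archive host from a release month and year,
# assuming the pattern seen above (http://dec2017.archive.ensembl.org).
archive_host <- function(month, year) {
  month <- tolower(substr(month, 1, 3))      # "December" -> "dec"
  stopifnot(month %in% tolower(month.abb))   # reject anything that is not a month
  sprintf("http://%s%d.archive.ensembl.org", month, as.integer(year))
}

# Hypothetical wrapper around biomaRt::useMart(); requires biomaRt and a
# network connection when it is actually called.
connect_archive <- function(month, year, dataset = "mmusculus_gene_ensembl") {
  biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                   host    = archive_host(month, year),
                   dataset = dataset)
}

# mart_obj <- connect_archive("dec", 2017)
```

Keeping the host construction separate from the connection makes it easy to check which URL would be requested before sending any query.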


This uses the list of archives at https://www.ensembl.org/info/website/archives/index.html to determine the current release. When the next Ensembl release comes out, biomaRt will no longer redirect your URL and will stick with the specified archive version, so your results should stay stable over time.
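
The idea can be sketched roughly as follows. This is not biomaRt's actual implementation: resolve_host() is a hypothetical helper, and the small archives table is an illustrative stand-in whose column names are an assumption rather than the real output of listEnsemblArchives().

```r
# Illustrative stand-in for the archive list; column names are assumed.
archives <- data.frame(
  version         = c("91", "90"),
  url             = c("http://dec2017.archive.ensembl.org",
                      "http://aug2017.archive.ensembl.org"),
  current_release = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pick the host to query for a requested version.
# The current release is served from the main site; older releases are
# served from their own archive URL.
resolve_host <- function(version, archives, main_host = "http://www.ensembl.org") {
  row <- archives[archives$version == as.character(version), , drop = FALSE]
  if (nrow(row) != 1) stop("Unknown Ensembl version: ", version)
  if (row$current_release) main_host else row$url
}

resolve_host(91, archives)  # current release, so the main site is used
resolve_host(90, archives)  # older release, so its archive URL is used
```

Once a newer release is flagged as current in the table, the same call for version 91 would return the dec2017 archive URL, which is why the results should not change.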

Let me know if this throws up any issues.