Question: BioMart webservices broken?
5 months ago by
mlee0
mlee0 wrote:

I am trying to convert from ensembl ID to gene symbol. Here is the chunk of code I'm using:

ensembl = useMart( "ensembl", dataset = "hsapiens_gene_ensembl")
genemap <- getBM( attributes = c("ensembl_gene_id", "entrezgene", "hgnc_symbol"),
filters = "ensembl_gene_id",
values = dat$ensembl, mart = ensembl )
idx <- match( dat$ensembl, genemap$ensembl_gene_id )
dat$entrez <- genemap$entrezgene[idx]
dat$hgnc_symbol <- genemap$hgnc_symbol[idx]
dat <- subset(dat, select = -c(ensembl, entrez))


I get this error when I run it:

The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1.
Please report this on the support site at http://support.bioconductor.org


Anyone able to help? Thanks.

modified 5 months ago by James W. MacDonald (52k) • written 5 months ago by mlee0

When you tag a post with a package name, it sends an email alert to the package maintainer. As this isn't a DESeq2 post, I'm removing the tag.

5 months ago by
United States
James W. MacDonald (52k) wrote:

Use useEnsembl, rather than useMart, and choose the closest Ensembl mirror to you. But do choose a mirror, as the main site seems to be the (consistent) problem here.
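
As a rough sketch of that advice, the helper below tries a list of mirrors in order and returns the first connection that succeeds. The helper name connect_first_available and the mirror order are illustrative assumptions, not part of biomaRt; the only biomaRt call is useEnsembl() with its mirror argument, and which mirror names are accepted may depend on the biomaRt version you have installed.

```r
# Hypothetical helper (not part of biomaRt): try each mirror in turn and
# return the first connection that succeeds.
# `connect` is any function that takes a mirror name and returns a connection.
connect_first_available <- function(mirrors, connect) {
  for (m in mirrors) {
    conn <- tryCatch(connect(m), error = function(e) {
      message("Mirror '", m, "' failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(conn)) {
      return(conn)
    }
  }
  stop("None of the mirrors responded: ", paste(mirrors, collapse = ", "))
}

# Real usage needs the biomaRt package and network access, so it only runs in
# an interactive session. The mirror order is an assumption; put the one
# nearest to you first.
if (interactive() && requireNamespace("biomaRt", quietly = TRUE)) {
  ensembl <- connect_first_available(
    c("useast", "uswest", "asia"),
    function(m) biomaRt::useEnsembl(biomart = "ensembl",
                                    dataset = "hsapiens_gene_ensembl",
                                    mirror = m)
  )
}
```

The resulting ensembl object can then be passed as the mart argument to getBM() exactly as in the original code.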

Thanks for the response. This worked.

Glad it worked for you. Can you add the output of sessionInfo() here? I updated biomaRt in the last few days to try to use the nearest mirror automatically, so I'd like to see if you have that version.

This is really just a side question: why is it important to choose the mirror nearest to you? What difference does it make which mirror one uses, as long as it is online?

I don't work for Ensembl, so this is just a guess, but based on the distribution of the mirrors (uswest, useast, europe, asia) I think the hope is that this will fairly evenly distribute load across sites, although time zone differences will undermine that a bit. It should also be marginally quicker to access a site hosted nearby, as the physical distance packets will have to travel from your machine to the site will probably be lower, but I have no data on how much difference that will actually make.

From a results point of view it doesn't make a difference; the mirrors should return identical results, so go with whatever works.

I still get the same error as James W. MacDonald. Is there any other potential reason why I could be getting this error? Is there a filesize (number of rows) limit perhaps?

I am having the same issue as James W. MacDonald. My command worked for months, and the query took 1-2 min to run. Now it starts, estimates 20-30 min, and fails before it reaches the end.
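
For long queries that fail partway through, one thing that might be worth trying is to split the IDs into batches and retry each failed batch, so a single timeout does not throw away everything fetched so far. This is only a sketch under assumptions: query_in_batches, its batch size, and its retry count are made up for illustration and are not part of biomaRt; nothing in this thread confirms a row limit; and biomaRt may already batch large queries internally, in which case this only adds the per-batch retry.

```r
# Hypothetical helper (not part of biomaRt): run a lookup in batches and retry
# each failed batch a few times before giving up.
# `query` is any function that takes a vector of IDs and returns a data frame.
query_in_batches <- function(ids, query, batch_size = 500,
                             max_tries = 3, wait = 1) {
  ids <- unique(ids)
  # Assign consecutive IDs to groups of at most `batch_size` elements.
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- lapply(batches, function(batch) {
    for (attempt in seq_len(max_tries)) {
      out <- tryCatch(query(batch), error = function(e) NULL)
      if (!is.null(out)) {
        return(out)
      }
      Sys.sleep(wait)  # pause briefly before the next attempt
    }
    stop("Batch failed after ", max_tries, " attempts")
  })
  # Stack the per-batch data frames into one result.
  do.call(rbind, unname(results))
}

# Real usage needs biomaRt, network access, and the `dat` data frame from the
# question, so it only runs in an interactive session. Attribute names are
# copied from the question and may differ between Ensembl releases.
if (interactive() && requireNamespace("biomaRt", quietly = TRUE)) {
  ensembl <- biomaRt::useEnsembl(biomart = "ensembl",
                                 dataset = "hsapiens_gene_ensembl",
                                 mirror = "useast")
  genemap <- query_in_batches(dat$ensembl, function(batch) {
    biomaRt::getBM(attributes = c("ensembl_gene_id", "entrezgene", "hgnc_symbol"),
                   filters = "ensembl_gene_id",
                   values = batch, mart = ensembl)
  })
}
```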