Error when trying to access biomaRt from a cluster
camerond • 0
@camerond-15316
Last seen 5 weeks ago
United Kingdom

I'm trying to use biomaRt on a Slurm cluster. When I run the following basic command in R (3.5.1) from my home folder, it works fine.

library(biomaRt)

# Connect to the Ensembl genes mart and select the human dataset
genemart = useMart(host = "www.ensembl.org",
                   biomart = "ENSEMBL_MART_ENSEMBL",
                   dataset = "hsapiens_gene_ensembl")

However, when I send the job to the cluster it fails with the following error message:

Request to BioMart web service failed.
The BioMart web service you're accessing may be down.
Check the following URL and see if this website is available:
http://www.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt
Error in if (!grepl(x = registry, pattern = "^\n*<MartRegistry>")) { : 
argument is of length zero
Calls: useMart -> listMarts
Execution halted

I have tried to run this from a different port (port = 443), as suggested here, but I get the same error message.

This command is the first step in quite a long list of queries, so I can't just run the entire script from my home folder. I can't work out why this error occurs, as I'm loading the same version of R and using the same version of biomaRt (2.38.0).

Any ideas on how to get around this would be greatly appreciated.

biomaRt R cluster • 1.2k views

The package developer may respond, but have you also tried a mirror, e.g. host = 'uswest.ensembl.org'?


Many thanks for the suggestion. I have tried it, but no joy, unfortunately. I'm not sure why this works when I run it locally but not when it is sent to the cluster.

@martin-morgan-1513
Last seen 4 days ago
United States

There is probably a problem with internet access from your cluster. Try to access the list of marts directly:

url = "http://www.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt"
response = readLines(url)

response should be a character vector with contents like this (abbreviated):

> response
 [1] ""
 [2] "<MartRegistry>"
 [3] "  <MartURLLocation database=\"ensembl_mart_100\" default=\"1\" displayName=\"Ensembl Genes 100\" host=\"www.ensembl.org\" includeDatasets=\"\" martUser=\"\" name=\"ENSEMBL_MART_ENSEMBL\" path=\"/biomart/martservice\" port=\"80\" serverVirtualSchema=\"default\" visible=\"1\" />"
...
[10] "</MartRegistry>"

but I guess you actually receive an error response of some sort...
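
To make the same check usable inside a batch job, here is a minimal sketch that wraps the readLines() call in tryCatch() so the script reports the failure instead of halting. The helper name check_registry is hypothetical (it is not part of biomaRt), and the test for the "<MartRegistry>" tag is only an assumption based on the expected output shown above.

```r
# Hypothetical helper (not part of biomaRt): try to read a URL or file path
# and report whether the response looks like a BioMart registry.
check_registry <- function(url) {
  tryCatch({
    response <- suppressWarnings(readLines(url, warn = FALSE))
    list(ok = any(grepl("<MartRegistry>", response, fixed = TRUE)),
         message = "connection succeeded")
  }, error = function(e) {
    # Connection failures end up here instead of stopping the script
    list(ok = FALSE, message = conditionMessage(e))
  })
}

registry_url <- "http://www.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt"
# Run this both on the login node and inside a cluster job to compare:
# result <- check_registry(registry_url)
# print(result$message)
```

If ok is TRUE on the login node but FALSE inside a job, that would point to the compute nodes lacking outbound internet access rather than to biomaRt itself.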


Yes, you are right. It threw the error shown below. I'll ask our cluster support whether there is a workaround. Looking into this a bit further, it seems biomaRt could not even be loaded with the library(biomaRt) command, so it was failing a bit earlier than I thought. Many thanks for your tip.

Error in file(con, "r") : 
cannot open the connection to 'http://www.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt'
Calls: readLines -> file
In addition: Warning message:
In file(con, "r") :
URL 'http://www.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt': status was 'Couldn't connect to server'
Execution halted
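
If outbound access from the compute nodes cannot be enabled, one possible workaround (an assumption, not something confirmed in this thread) is to run the biomaRt queries once on a machine that can reach Ensembl, save the results, and have the cluster job read the saved copy. The helper name cached_query and the file name genes.rds below are hypothetical, and the attributes in the commented example are only illustrative.

```r
# Hypothetical helper: run `query_fn` once where internet access works,
# store the result on disk, and reuse the stored copy on later calls.
cached_query <- function(cache_file, query_fn) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  result <- query_fn()
  saveRDS(result, cache_file)
  result
}

# Example use (run first on a machine that can reach Ensembl):
# library(biomaRt)
# genes <- cached_query("genes.rds", function() {
#   mart <- useMart(host = "www.ensembl.org",
#                   biomart = "ENSEMBL_MART_ENSEMBL",
#                   dataset = "hsapiens_gene_ensembl")
#   getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"), mart = mart)
# })
```

Note that the query results are cached here, not the mart object returned by useMart(), since any later query on that object would still need a connection.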

Wait, just to be sure: the package is called biomaRt, so to load it you'd need library(biomaRt) (not library(BiomaRt)).


Yes, sorry, that was a typo in the post. I have corrected it now.
