biomaRt: list of hosts accessed
@ashokragavendran-13553
Last seen 6.7 years ago

   Hello all,

I am trying to use the biomaRt package from within a secure computing environment. In this context we have to configure the firewall to allow only specific IPs, to maintain compliance with security procedures. From my understanding of how the biomaRt package works, a request is sent to

www.ensembl.org/biomart/martservice

which returns an XML result listing the various datasets and the hosts they are located on. So, from what I understand, the hostname is not hard-coded into the package, other than the path to the mart service. Am I right in this presumption, or do I misunderstand the code in the package?

I originally tried contacting the Ensembl help desk about this, as in my previous experience they have been very responsive and knowledgeable. However, this time the support person kept insisting that I find out which servers are being accessed by biomaRt.
I would be much obliged for any help in this regard and any insights from the community.

  Cheers

Ashok

 

biomart ensembl • 2.8k views
Mike Smith ★ 6.5k
@mike-smith
Last seen 2 hours ago
EMBL Heidelberg

You're correct that the host name is not hard coded into the package (at least it shouldn't be).  biomaRt (the R package) is designed as a tool to access any server that is running BioMart (the service) to provide access to its data.  The host argument is used to specify this in most functions.  Ensembl is by far the most widely used BioMart instance, and so www.ensembl.org is typically used as a default, but you can specify whichever URL you require.

However, you're also correct that the value given to the host argument isn't necessarily the final address of the data.  When selecting a mart, an initial query is made to the XML registry of marts held on the specified server.  Each entry then has its own host="" value, which is really where the data are accessed.  You can view the Ensembl registry entry here:

http://www.ensembl.org/biomart/martservice?type=registry

If you need to view a different BioMart server you can typically use the same URL format as this, just replacing the www.ensembl.org part with the server you're using.  However, in my experience it's very rare to find a host="" value in the registry that isn't the same as the initial host (I've seen it only once, and in that instance it was broken anyway).
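If it helps to check this for a particular server, here is a rough base R sketch that builds the registry URL in the format above and pulls out the host="" values. The helper names registry_url() and registry_hosts() are my own and not part of biomaRt, and the XML below is a made-up sample (mart.example.org is a placeholder), not a copy of the real Ensembl registry.

```r
# Hypothetical helpers for illustration; neither is part of biomaRt.

# Build the registry URL for a BioMart server, following the URL format above.
registry_url <- function(host, path = "/biomart/martservice") {
  paste0("http://", host, path, "?type=registry")
}

# Extract the unique host="" attribute values from registry XML given as text.
registry_hosts <- function(xml_text) {
  xml_text <- paste(xml_text, collapse = "\n")
  hits <- regmatches(xml_text,
                     gregexpr('\\bhost="[^"]*"', xml_text, perl = TRUE))[[1]]
  unique(sub('^host="([^"]*)"$', "\\1", hits))
}

# Made-up registry content, used so the example runs without network access.
sample_registry <- c(
  '<MartRegistry>',
  '  <MartURLLocation name="ENSEMBL_MART_ENSEMBL" host="www.ensembl.org" path="/biomart/martservice" port="80" />',
  '  <MartURLLocation name="EXAMPLE_MART" host="mart.example.org" path="/biomart/martservice" port="80" />',
  '</MartRegistry>'
)

hosts <- registry_hosts(sample_registry)
print(hosts)

# For a live server (needs network access; the scheme may need to be https):
# hosts <- registry_hosts(readLines(registry_url("www.ensembl.org"), warn = FALSE))
```

Any host that turns up here and differs from the one you passed in would need its own firewall rule.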


One caveat if you're trying to access Ensembl is that your requests may be redirected to the geographically nearest mirror.  Thus you can end up accessing useast.ensembl.org, uswest.ensembl.org or asia.ensembl.org even if you explicitly set host = "www.ensembl.org".  To avoid this you can use the argument ensemblRedirect = FALSE.
