BiomaRt and phytozome
s.nielsen ▴ 20
@snielsen-20799

Any idea how to connect to Phytozome through biomaRt? Digging into the biomaRt code, it seems to be an RCurl problem, specifically in getURL. Something to do with SSL certificates?

library(biomaRt)
library(RCurl)
phyto <- "https://phytozome.jgi.doe.gov"
listMarts(host = phyto)

Request to BioMart web service failed.
The BioMart web service you're accessing may be down.
Check the following URL and see if this website is available:
https://phytozome.jgi.doe.gov:80/biomart/martservice?type=registry&requestid=biomaRt
Error in if (!grepl(x = registry, pattern = "^\n*<martregistry>")) { :
  argument is of length zero

getURL(phyto)

Error in function (type, msg, asError = TRUE) :
  error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

url.exists(phyto)

[1] FALSE

sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252  LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RCurl_1.95-4.12 bitops_1.0-6    biomaRt_2.38.0

Mike Smith ★ 5.3k
@mike-smith
EMBL Heidelberg / de.NBI

Try setting the port to 443, since you're using https, e.g.

listMarts(host = phyto, port = 443)

                    biomart                  version
1            phytozome_mart V12 Genomes and Families
2 phytozome_diversity__mart     V12 Genome Diversity
3    phytozome_mart_archive           Genome Archive


I thought I'd modified biomaRt to try and do this automatically, but maybe it was only for Ensembl. I'll add a patch to use the appropriate port for the protocol unless overridden by the user.
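
The kind of patch described here could look something like the minimal sketch below. Note that `default_port()` is a hypothetical helper written for illustration, not the actual biomaRt code: it assumes the host string carries its scheme, picks 443 for https and 80 otherwise, and always lets an explicit user value win.

```r
# Hypothetical helper (not part of biomaRt): choose a port from the URL
# scheme unless the user has supplied one explicitly.
default_port <- function(host, port = NULL) {
  if (!is.null(port)) {
    return(port)  # an explicit user choice always wins
  }
  # https implies 443; anything else falls back to the plain-http port 80
  if (grepl("^https://", host, ignore.case = TRUE)) 443 else 80
}

default_port("https://phytozome.jgi.doe.gov")               # 443
default_port("http://example.org")                          # 80
default_port("https://phytozome.jgi.doe.gov", port = 8443)  # 8443
```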


Hey Mike, I'm following this up again. It turns out that there are issues with doing this on Windows but not on Mac (for me at least, after all my troubleshooting). It is related to the SSL version RCurl uses on Windows vs Mac:

### Windows
RCurl::curlVersion()$ssl_version
[1] "OpenSSL/1.0.0o"
### Mac
RCurl::curlVersion()$ssl_version
[1] "LibreSSL/2.6.5"

I found that on Windows the curl package itself uses a newer SSL version:

curl_version()$ssl_version
[1] "(OpenSSL/1.0.2n) WinSSL"

I made a small hack to listMarts that uses curl instead of RCurl to obtain the <martregistry> XML information, altered an error check and it worked (well, listing the marts worked)

...
registry = do.call(paste, as.list(readLines(curl(request))))
if (!grepl(x = registry, pattern = "MartRegistry>"))
...

And this is the output

listMarts2(host = 'https://phytozome.jgi.doe.gov', port = 443)

                    biomart                  version
1            phytozome_mart V12 Genomes and Families
2 phytozome_diversity__mart     V12 Genome Diversity
3    phytozome_mart_archive           Genome Archive

I'm not sure how to get RCurl to change its SSL version, but at least I got here.
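
For anyone wanting to try the same workaround, it can be sketched roughly as follows. This is not the modified `listMarts` itself: `registry_url()` and `fetch_registry()` are hypothetical helper names, the request path is taken from the URL in the error message above, and the sketch assumes the curl package is installed.

```r
# Hypothetical helper: build the registry request URL shown in the
# biomaRt error message for a given host and port.
registry_url <- function(host, port = 443, path = "/biomart/martservice") {
  paste0(host, ":", port, path, "?type=registry&requestid=biomaRt")
}

# Hypothetical helper: fetch the registry XML with curl instead of RCurl
# and apply the relaxed check on the closing MartRegistry tag.
fetch_registry <- function(host, port = 443) {
  con <- curl::curl(registry_url(host, port))
  on.exit(close(con))
  registry <- paste(readLines(con, warn = FALSE), collapse = "")
  if (!grepl("MartRegistry>", registry, fixed = TRUE)) {
    stop("Response does not look like a BioMart registry")
  }
  registry
}

# Requires network access, so left commented out:
# registry <- fetch_registry("https://phytozome.jgi.doe.gov")
```

Only the fetching step is swapped here; parsing the XML into the data frame that `listMarts` returns is left to the existing biomaRt code.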


Thanks for looking into this in such depth. It's interesting that RCurl for Windows ships with such an outdated version of SSL. I don't think there's any way to force RCurl to use the correct version of the protocol without installing libcurl yourself and building the package from source, which is a total pain!

However, I'm not sure there's any reason to prefer RCurl over curl, so I'll look into swapping one for the other inside biomaRt.


I've taken a look at the code, and I think I've actually already removed the parts that relied on RCurl. Can you update your version of biomaRt to at least version 2.40.0? If I use that on Windows it works fine with Phytozome.
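
A quick way to check whether an installed package meets that minimum is sketched below. `has_min_version()` is a hypothetical convenience wrapper around `utils::packageVersion()`, and it assumes the package in question is already installed.

```r
# Hypothetical helper: TRUE if the installed package is at least `minimum`.
has_min_version <- function(pkg, minimum) {
  utils::packageVersion(pkg) >= minimum
}

# For example (requires biomaRt to be installed):
# has_min_version("biomaRt", "2.40.0")
# If this is FALSE, the package can be updated through Bioconductor,
# e.g. with BiocManager::install("biomaRt").
```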


Hey Mike,

Sorry for the delay in response, but yes it is all working nicely now on Windows! Thank you greatly for your fast response and fixing of this. This will make my life much easier.


I think there may be a network issue at play here? I still have the error. I'm currently at my university. I may try at home and see what happens.


What happens if you follow the instruction in the error message and go to https://phytozome.jgi.doe.gov:80/biomart/martservice?type=registry&requestid=biomaRt in a browser?

I expect that should fail with a protocol error, so maybe it's more informative to visit https://phytozome.jgi.doe.gov:443/biomart/martservice?type=registry&requestid=biomaRt


So with port 443, the web link works.

But through R it does not:

library(RCurl)

Error in function (type, msg, asError = TRUE) :
  error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

I've asked some friends off campus to try, and it works for them, so it must be something with my university internet (proxy settings or something). I'm about to ask IT.