getLDS across both "normal" Ensembl biomart and plants.ensembl.org biomart
devmoy • 0

I am trying to find orthologs between a variety of plant and animal species, including Arabidopsis thaliana and humans. When I call getLDS with the Arabidopsis genes dataset from plants_mart at plants.ensembl.org and the human genes dataset at ensembl.org, I get an error saying that the Arabidopsis gene dataset was not found. I am reasonably confident this is because the two marts have different hosts, and I am prepared to hear that getLDS simply can't be run on two marts with different hosts, but I figured I would ask in case I am missing something.

> library("biomaRt")
> ensembl = useMart("ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
> plants = useMart("plants_mart", host="plants.ensembl.org", dataset="athaliana_eg_gene")
> getLDS(attributes="ensembl_gene_id", mart=ensembl, attributesL="ensembl_gene_id", martL=plants)
Error in getLDS(attributes = "ensembl_gene_id", mart = ensembl, attributesL = "ensembl_gene_id",  :
  Query ERROR: caught BioMart::Exception::Usage: WITHIN Virtual Schema : default, Dataset athaliana_eg_gene NOT FOUND
> listDatasets(plants)[3:3,1]
[1] "athaliana_eg_gene"
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.5 (Final)

Matrix products: default
BLAS/LAPACK: /cm/shared/apps/OpenBLAS/current/lib/libopenblas_sandybridgep-r0.2.14.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] biomaRt_2.36.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17         AnnotationDbi_1.42.1 magrittr_1.5
 [4] BiocGenerics_0.26.0  hms_0.4.2            progress_1.2.0
 [7] IRanges_2.14.10      bit_1.1-14           R6_2.2.2
[10] rlang_0.2.1          httr_1.3.1           stringr_1.3.1
[13] blob_1.1.1           tools_3.5.0          parallel_3.5.0
[16] Biobase_2.40.0       DBI_1.0.0            bit64_0.9-7
[19] digest_0.6.15        assertthat_0.2.0     crayon_1.3.4
[22] S4Vectors_0.18.3     bitops_1.0-6         curl_3.2
[25] RCurl_1.95-4.10      memoise_1.1.0        RSQLite_2.1.1
[28] stringi_1.2.3        compiler_3.5.0       prettyunits_1.0.2
[31] stats4_3.5.0         XML_3.98-1.11        pkgconfig_2.0.1
R biomart • 492 views
Mike Smith ★ 5.6k
EMBL Heidelberg / de.NBI

Yes, you're correct that the two datasets must be on the same host. The linking between the various marts is stored internally within the BioMart service, but there's no connection between separate BioMart instances. The BioMart instances provided at www.ensembl.org and plants.ensembl.org are distinct, so you can't use getLDS() to query them together.

I've updated the package (version 2.37.3) to catch this and hopefully provide a more informative error message:

> getLDS(attributes="ensembl_gene_id", mart=ensembl, 
+        attributesL="ensembl_gene_id", martL=plants)
Error: Both datasets must be located on the same host.
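
For readers on an older biomaRt version, the same-host rule described above can be checked by hand before calling getLDS(). The following is only a rough sketch in base R, not the check the package itself performs: host_name() and same_host() are hypothetical helper names, and the URL strings are illustrative. With real Mart objects the host strings would have to be taken from the objects themselves, which is assumed rather than shown here.

```r
# Hypothetical helper: reduce a BioMart host string or service URL
# to its bare, lower-case hostname.
host_name <- function(url) {
  url <- sub("^[a-zA-Z]+://", "", url)  # drop a leading scheme such as http://
  url <- sub("[:/?].*$", "", url)       # drop port, path and query string
  tolower(url)
}

# Hypothetical helper: TRUE when both strings point at the same hostname.
same_host <- function(url_a, url_b) {
  identical(host_name(url_a), host_name(url_b))
}

# Illustrative values only; the two Ensembl instances from this thread.
if (!same_host("www.ensembl.org", "plants.ensembl.org")) {
  message("Marts are on different hosts; getLDS() cannot link them.")
}
```

Note that this compares hostnames literally, so "ensembl.org" and "www.ensembl.org" would be treated as different hosts.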