Issue with biomaRt's getLDS: Query ERROR: caught BioMart::Exception::Query: returning undef ... missing attributes for your exportable
eefernan5 • 0

Hello all,

I'm trying to map homologs with biomaRt, but I'm running into an issue and haven't found a specific solution in my search. I keep getting an error with the mart I'd like to use, but with a second mart (sheep2) that doesn't match my IDs, the function works fine.

I have no issue retrieving gene names and IDs when using an old archive.

sheep <- useMart("ensembl", dataset = "oaries_gene_ensembl", host="")

sheep2 <- useMart("ensembl", dataset = "oarambouillet_gene_ensembl", host="")

human <- useMart("ensembl", dataset = "hsapiens_gene_ensembl", host="")
mapped_ids <- getLDS(attributes = c("ensembl_gene_id", "external_gene_name"),
                     filters = "ensembl_gene_id",
                     values = sheep_ensembl_ids,
                     mart = sheep,
                     attributesL = c("ensembl_gene_id", "external_gene_name", "gene_biotype"),
                     martL = human)

This returns the following error:

Error in getLDS(attributes = c("ensembl_gene_id", "external_gene_name"),  : 
  Query ERROR: caught BioMart::Exception::Query: returning undef ... missing attributes for your exportable?

I noticed the difference by accident when I used the wrong sheep dataset (oarambouillet_gene_ensembl):

mapped_ids2 <- getLDS(attributes = c("ensembl_gene_id", "external_gene_name"),
                      filters = "ensembl_gene_id",
                      values = sheep_ensembl_ids,
                      mart = sheep2,
                      attributesL = c("ensembl_gene_id", "external_gene_name", "gene_biotype"),
                      martL = human)

The function runs, although nothing is returned because my data uses oaries_gene_ensembl IDs. Both the human mart and the second sheep mart were "Large Mart" objects, while my oaries_gene_ensembl mart is a "Formal Class Mart". I don't know if that matters.

For reproducibility, you can use the IDs below, or remove the filters argument from getLDS.

sheep_ensembl_ids <- c("ENSOARG00000000002", "ENSOARG00000000006", "ENSOARG00000000016", "ENSOARG00000000019", "ENSOARG00000000022")

Session Info

R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Australia/Brisbane
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.58.2

loaded via a namespace (and not attached):
 [1] rappdirs_0.3.3          utf8_1.2.4              generics_0.1.3          bitops_1.0-7           
 [5] xml2_1.3.6              RSQLite_2.3.5           stringi_1.8.3           hms_1.1.3              
 [9] digest_0.6.34           magrittr_2.0.3          evaluate_0.23           fastmap_1.1.1          
[13] blob_1.2.4              progress_1.2.3          AnnotationDbi_1.64.1    GenomeInfoDb_1.38.6    
[17] DBI_1.2.1               BiocManager_1.30.22     httr_1.4.7              fansi_1.0.6            
[21] XML_3.99-0.16.1         Biostrings_2.70.2       cli_3.6.1               rlang_1.1.3            
[25] crayon_1.5.2            dbplyr_2.4.0            XVector_0.42.0          Biobase_2.62.0         
[29] bit64_4.0.5             yaml_2.3.8              cachem_1.0.8            tools_4.3.2            
[33] memoise_2.0.1           dplyr_1.1.4             GenomeInfoDbData_1.2.11 filelock_1.0.3         
[37] BiocGenerics_0.48.1     curl_5.2.0              vctrs_0.6.5             R6_2.5.1               
[41] png_0.1-8               stats4_4.3.2            lifecycle_1.0.4         BiocFileCache_2.10.1   
[45] zlibbioc_1.48.0         KEGGREST_1.42.0         stringr_1.5.1           S4Vectors_0.40.2       
[49] IRanges_2.36.0          bit_4.0.5               pkgconfig_2.0.3         pillar_1.9.0           
[53] data.table_1.15.0       glue_1.7.0              xfun_0.42               tibble_3.2.1           
[57] tidyselect_1.2.0        knitr_1.45              rstudioapi_0.15.0       htmltools_0.5.7        
[61] rmarkdown_2.25          compiler_4.3.2          prettyunits_1.2.0       RCurl_1.98-1.14
Mike Smith ★ 6.6k
EMBL Heidelberg

If you ever get an error with biomaRt that looks like this, it's actually coming from the Ensembl BioMart server itself.

Query ERROR: caught BioMart::Exception::Query: returning undef ... missing attributes for your exportable?

It's almost certainly not an issue with your code but something server-side, and it's pretty hard to tell from the message alone what is actually going wrong.
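If it helps to separate this kind of server-side failure from an ordinary R error in a script, one option is to wrap the query in tryCatch and inspect the message. This is only a rough sketch: is_biomart_server_error and safe_query are hypothetical helper names, not part of biomaRt, and the failing query is simulated with a stand-in function so the example runs without a network connection.

```r
# Hypothetical helper: TRUE if an error message looks like it was produced
# by the BioMart server (Perl-style "BioMart::Exception" text).
is_biomart_server_error <- function(msg) {
  grepl("BioMart::Exception", msg, fixed = TRUE)
}

# Hypothetical wrapper: run a query function, return NULL (with a message)
# on a server-side BioMart error, and re-raise any other error unchanged.
safe_query <- function(query_fun) {
  tryCatch(
    query_fun(),
    error = function(e) {
      msg <- conditionMessage(e)
      if (is_biomart_server_error(msg)) {
        message("BioMart server reported an error: ", msg)
        return(NULL)
      }
      stop(e)
    }
  )
}

# Stand-in for a failing getLDS() call so the sketch runs offline.
fake_query <- function() {
  stop("Query ERROR: caught BioMart::Exception::Query: returning undef ... missing attributes for your exportable?")
}

result <- safe_query(fake_query)
```

With a real mart you would pass the actual call instead, for example safe_query(function() getLDS(...)) with the arguments from the question.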

In this case, it might be because the oaries_gene_ensembl dataset doesn't have any homolog information in it. You can see this by looking at the available attribute 'pages':

> unique(listAttributes(sheep)$page)
[1] "feature_page" "structure"    "snp"          "sequences"   
> unique(listAttributes(sheep2)$page)
[1] "feature_page" "structure"    "homologs"     "snp"          "sequences"   
> unique(listAttributes(human)$page)
[1] "feature_page" "structure"    "homologs"     "snp"          "sequences"

On the website this manifests as a missing radio button when choosing attributes. It's only a guess that this is what's causing the issue and you'll have to contact the Ensembl team directly to understand why that genome doesn't have homolog information available. Perhaps it's because oaries is treated as a breed and oarambouillet is the primary genome assembly.
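The page check above could be turned into a small guard that runs before calling getLDS. This is a sketch under the same unconfirmed assumption, namely that a missing "homologs" page is what breaks the linked query. has_attribute_page is a hypothetical helper, and the two data frames are mock stand-ins for listAttributes() output so the example runs offline.

```r
# Hypothetical helper: `attrs` is a data frame shaped like the output of
# listAttributes(mart), which includes a `page` column.
has_attribute_page <- function(attrs, page = "homologs") {
  page %in% unique(attrs$page)
}

# Mock attribute tables (illustrative rows only), mirroring the pages
# reported above for the two sheep datasets.
sheep_attrs <- data.frame(
  name = c("ensembl_gene_id", "external_gene_name"),
  page = c("feature_page", "feature_page"),
  stringsAsFactors = FALSE
)
sheep2_attrs <- data.frame(
  name = c("ensembl_gene_id", "hsapiens_homolog_ensembl_gene"),
  page = c("feature_page", "homologs"),
  stringsAsFactors = FALSE
)

sheep_ok  <- has_attribute_page(sheep_attrs)   # no "homologs" page
sheep2_ok <- has_attribute_page(sheep2_attrs)  # "homologs" page present
```

Against the live marts the call would be has_attribute_page(listAttributes(sheep)), and a FALSE result would be a hint to expect trouble with homolog queries on that dataset.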

Thank you! I'm not very familiar with the mart's structure yet, so I wouldn't have figured that out easily. I'll try to follow up with Ensembl.


