bioMart "The number of columns in the result table does not equal the number of attributes in the query."
noah.reid ▴ 10
Last seen 17 months ago
United States

I have some code for extracting annotation information from Ensembl and it has ceased working with the usual hostname:

# Ensembl gene IDs from Fundulus heteroclitus, release 107.
gids <- c("ENSFHEG00000014345","ENSFHEG00000014326","ENSFHEG00000014282","ENSFHEG00000014227","ENSFHEG00000014113","ENSFHEG00000014098","ENSFHEG00000014086","ENSFHEG00000014072","ENSFHEG00000014058","ENSFHEG00000014032","ENSFHEG00000013967","ENSFHEG00000000334","ENSFHEG00000013942","ENSFHEG00000013847","ENSFHEG00000013795","ENSFHEG00000013774","ENSFHEG00000013756","ENSFHEG00000013549","ENSFHEG00000013406","ENSFHEG00000013399")

ensemblhost <- ""

killi_mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", host = ensemblhost, dataset = "fheteroclitus_gene_ensembl")

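# Full argument names (filters, values) rather than relying on R's partial matching.
ann <- getBM(filters = "ensembl_gene_id", values = gids, attributes = c("ensembl_gene_id", "description", "transcript_length"), mart = killi_mart)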

This fails with error:

Error in .processResults(postRes, mart = mart, hostURLsep = sep, fullXmlQuery = fullXmlQuery,  : 
  The query to the BioMart webservice returned an invalid result.
The number of columns in the result table does not equal the number of attributes in the query.
Please report this on the support site at

If I choose an "archive" hostname, the exact same code works, even though the July archive is still 107 (I think?):

gids <- c("ENSFHEG00000014345","ENSFHEG00000014326","ENSFHEG00000014282","ENSFHEG00000014227","ENSFHEG00000014113","ENSFHEG00000014098","ENSFHEG00000014086","ENSFHEG00000014072","ENSFHEG00000014058","ENSFHEG00000014032","ENSFHEG00000013967","ENSFHEG00000000334","ENSFHEG00000013942","ENSFHEG00000013847","ENSFHEG00000013795","ENSFHEG00000013774","ENSFHEG00000013756","ENSFHEG00000013549","ENSFHEG00000013406","ENSFHEG00000013399")

ensemblhost <- ""

killi_mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", host = ensemblhost, dataset = "fheteroclitus_gene_ensembl")

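# Full argument names (filters, values) rather than relying on R's partial matching.
ann <- getBM(filters = "ensembl_gene_id", values = gids, attributes = c("ensembl_gene_id", "description", "transcript_length"), mart = killi_mart)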

Am I doing something wrong here?

Here are the results of sessionInfo():

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.52.0              goseq_1.48.0                geneLenDataBase_1.32.0     
 [4] BiasedUrn_1.07              ashr_2.2-54                 ggrepel_0.9.1              
 [7] forcats_0.5.2               stringr_1.4.1               dplyr_1.0.9                
[10] purrr_0.3.4                 readr_2.1.2                 tidyr_1.2.0                
[13] tibble_3.1.8                ggplot2_3.3.6               tidyverse_1.3.2            
[16] pheatmap_1.0.12             apeglm_1.18.0               DESeq2_1.36.0              
[19] SummarizedExperiment_1.26.1 Biobase_2.56.0              MatrixGenerics_1.8.1       
[22] matrixStats_0.62.0          GenomicRanges_1.48.0        GenomeInfoDb_1.32.3        
[25] IRanges_2.30.1              S4Vectors_0.34.0            BiocGenerics_0.42.0        

loaded via a namespace (and not attached):
  [1] googledrive_2.0.0        colorspace_2.0-3         rjson_0.2.21             ellipsis_0.3.2          
  [5] XVector_0.36.0           fs_1.5.2                 rstudioapi_0.13          farver_2.1.1            
  [9] bit64_4.0.5              AnnotationDbi_1.58.0     fansi_1.0.3              mvtnorm_1.1-3           
 [13] lubridate_1.8.0          xml2_1.3.3               codetools_0.2-18         splines_4.2.1           
 [17] cachem_1.0.6             geneplotter_1.74.0       jsonlite_1.8.0           Rsamtools_2.12.0        
 [21] broom_1.0.0              annotate_1.74.0          GO.db_3.15.0             dbplyr_2.2.1            
 [25] png_0.1-7                compiler_4.2.1           httr_1.4.4               backports_1.4.1         
 [29] assertthat_0.2.1         Matrix_1.4-1             fastmap_1.1.0            gargle_1.2.0            
 [33] cli_3.3.0                prettyunits_1.1.1        tools_4.2.1              coda_0.19-4             
 [37] gtable_0.3.0             glue_1.6.2               GenomeInfoDbData_1.2.8   rappdirs_0.3.3          
 [41] Rcpp_1.0.9               bbmle_1.0.25             cellranger_1.1.0         vctrs_0.4.1             
 [45] Biostrings_2.64.1        nlme_3.1-159             rtracklayer_1.56.1       rvest_1.0.3             
 [49] lifecycle_1.0.1          irlba_2.3.5              restfulr_0.0.15          XML_3.99-0.10           
 [53] googlesheets4_1.0.1      zlibbioc_1.42.0          MASS_7.3-58.1            scales_1.2.1            
 [57] hms_1.1.2                parallel_4.2.1           RColorBrewer_1.1-3       curl_4.3.2              
 [61] yaml_2.3.5               memoise_2.0.1            emdbook_1.3.12           bdsmatrix_1.3-6         
 [65] stringi_1.7.8            RSQLite_2.2.16           SQUAREM_2021.1           genefilter_1.78.0       
 [69] BiocIO_1.6.0             filelock_1.0.2           GenomicFeatures_1.48.3   BiocParallel_1.30.3     
 [73] truncnorm_1.0-8          rlang_1.0.4              pkgconfig_2.0.3          bitops_1.0-7            
 [77] lattice_0.20-45          invgamma_1.1             labeling_0.4.2           GenomicAlignments_1.32.1
 [81] bit_4.0.4                tidyselect_1.1.2         plyr_1.8.7               magrittr_2.0.3          
 [85] R6_2.5.1                 generics_0.1.3           DelayedArray_0.22.0      DBI_1.1.3               
 [89] mgcv_1.8-40              pillar_1.8.1             haven_2.5.0              withr_2.5.0             
 [93] survival_3.4-0           KEGGREST_1.36.3          RCurl_1.98-1.8           mixsqp_0.3-43           
 [97] modelr_0.1.9             crayon_1.5.1             utf8_1.2.2               BiocFileCache_2.4.0     
[101] tzdb_0.3.0               progress_1.2.2           locfit_1.5-9.6           grid_4.2.1              
[105] readxl_1.4.1             blob_1.2.3               digest_0.6.29            reprex_2.0.2            
[109] xtable_1.8-4             numDeriv_2016.8-1.1      munsell_0.5.0
biomaRt • 626 views
Mike Smith ★ 6.4k
Last seen 17 hours ago
EMBL Heidelberg

You are correct that at the moment the 107 archive URL will give the same results as using the current version without any archive/version number.

What I suspect is happening is that when you specify the usual hostname, your query actually gets redirected to your local Ensembl mirror (useast or uswest, I'd guess), whereas when you specify the complete archive URL, that resolves to the main Ensembl site. If there's something wrong with the local mirror, you might expect to see different behaviours from these very similar queries. However, I'd expect a problem with the mirror to more likely result in a "page not found" type error, although maybe biomaRt is just doing a bad job of conveying that.

You could try to prevent the redirection (if that's what's happening) by using useEnsembl() instead of useMart() and providing the mirror argument e.g.

killi_mart <- useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL", dataset = "fheteroclitus_gene_ensembl", mirror = "www")
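If pinning a single mirror isn't enough, one option is to wrap the whole query so it is retried against other mirrors when one returns an error. The following is only a rough sketch: try_mirrors() and killi_query() are hypothetical helper names, not part of biomaRt, and the mirror names assume the values accepted by useEnsembl() in this biomaRt release ("www", "useast", "uswest", "asia"). It also assumes gids is defined as in the question.

```r
# Hypothetical helper (not part of biomaRt): call query_fn(mirror) for each
# mirror in turn and return the first result that does not raise an error.
try_mirrors <- function(query_fn, mirrors = c("www", "useast", "asia")) {
  for (m in mirrors) {
    result <- tryCatch(
      query_fn(m),
      error = function(e) {
        # Report the failure and fall through to the next mirror.
        message("mirror '", m, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(result)) return(result)
  }
  stop("all mirrors failed: ", paste(mirrors, collapse = ", "))
}

# The query from the question, parameterised by mirror.
# Assumes `gids` exists in the calling environment.
killi_query <- function(mirror) {
  mart <- biomaRt::useEnsembl(biomart = "ENSEMBL_MART_ENSEMBL",
                              dataset = "fheteroclitus_gene_ensembl",
                              mirror = mirror)
  biomaRt::getBM(filters = "ensembl_gene_id",
                 values = gids,
                 attributes = c("ensembl_gene_id", "description", "transcript_length"),
                 mart = mart)
}

# Usage (needs network access and the biomaRt package):
# ann <- try_mirrors(killi_query)
```

The messages printed for each failed mirror should also show whether the error is specific to one mirror, which would support or rule out the redirection theory.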
