I am attempting to annotate some data downloaded from GEO using the getBM() function from the biomaRt package. I am working through multiple different GEO accessions, and some of them give the following error:
The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1.
Please report this on the support site at http://support.bioconductor.org
Here is an example:
### Dependencies
library(GEOquery)
library(Biobase)
library(biomaRt)
### Set up Mart
mart <- useMart('ENSEMBL_MART_ENSEMBL')
mart <- useDataset('hsapiens_gene_ensembl', mart)
### Get sets (returns list of two ExpressionSets)
gset_ls <- getGEO("GSE4922", GSEMatrix = TRUE, getGPL = FALSE)
names(gset_ls)
# [1] "GSE4922-GPL96_series_matrix.txt.gz" "GSE4922-GPL97_series_matrix.txt.gz"
### Select first one (the name contains "-" and ".", so index with [[ ]] rather than $)
myGSET <- gset_ls[["GSE4922-GPL96_series_matrix.txt.gz"]]
### Get annotation lookup
myLookup <- getBM(mart = mart,
                  attributes = c("affy_hg_u133a", "ensembl_gene_id", "external_gene_name"),
                  filters = "affy_hg_u133a",
                  values = rownames(exprs(myGSET)),
                  uniqueRows = TRUE)
The getBM() function begins the download and works for a while, but then ends with:
Batch submitting query [=======================>----] 84% eta: 1m
Error in biomaRt::getBM(mart = mart, attributes = c("affy_hg_u133a", "ensembl_gene_id", :
The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1.
Please report this on the support site at http://support.bioconductor.org
This exact same method works for other GEO accession numbers and arrays; only some accessions give this error. For example, it works for the GEO accession "GSE19615" with the GPL "GPL570". That accession uses a different array, so the filter is "affy_hg_u133_plus_2":
GEO = "GSE19615"
GPL = "GPL570"
FILTER = "affy_hg_u133_plus_2"
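To make the comparison concrete, here is a minimal sketch of how both cases could go through the same code path, with the accession, platform, and filter as parameters. The helper names pick_platform_index() and fetch_probe_lookup() are hypothetical and only for illustration. It also assumes that a single-platform series comes back as a one-element list whose name does not contain the GPL.

```r
# Hypothetical helper: find the list element for a given platform.
# Multi-platform series have names like "GSE4922-GPL96_series_matrix.txt.gz".
# Assumption: a single-platform series returns a one-element list.
pick_platform_index <- function(set_names, gpl) {
  if (length(set_names) == 1L) return(1L)
  idx <- grep(paste0("-", gpl, "_"), set_names, fixed = TRUE)
  if (length(idx) != 1L) stop("Could not uniquely match platform ", gpl)
  idx
}

# Hypothetical wrapper around the steps shown earlier in this post.
# The probe-ID filter is also requested as the first attribute.
fetch_probe_lookup <- function(geo, gpl, filter_name, mart) {
  gset_ls <- GEOquery::getGEO(geo, GSEMatrix = TRUE, getGPL = FALSE)
  gset <- gset_ls[[pick_platform_index(names(gset_ls), gpl)]]
  biomaRt::getBM(mart = mart,
                 attributes = c(filter_name, "ensembl_gene_id", "external_gene_name"),
                 filters = filter_name,
                 values = rownames(Biobase::exprs(gset)),
                 uniqueRows = TRUE)
}

# Usage (needs network access, so left commented out):
# mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl")
# ok   <- fetch_probe_lookup("GSE19615", "GPL570", "affy_hg_u133_plus_2", mart)
# bad  <- fetch_probe_lookup("GSE4922", "GPL96", "affy_hg_u133a", mart)
```

Only the three arguments differ between the working and failing runs, so the difference should come from the accession or array rather than from my code.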
Session Info:
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] readxl_1.3.1 wrh.rUtils_0.0.0.9000 data.table_1.12.2
[4] biomaRt_2.40.0 GEOquery_2.52.0 usethis_1.5.0
[7] devtools_2.0.2 Biobase_2.44.0 BiocGenerics_0.30.0
loaded via a namespace (and not attached):
[1] progress_1.2.2 tidyselect_0.2.5 remotes_2.0.4
[4] purrr_0.3.2 stats4_3.6.0 blob_1.1.1
[7] XML_3.98-1.20 rlang_0.3.4 pkgbuild_1.0.3
[10] pillar_1.4.1 glue_1.3.1 withr_2.1.2
[13] DBI_1.0.0 bit64_0.9-7 sessioninfo_1.1.1
[16] stringr_1.4.0 cellranger_1.1.0 memoise_1.1.0
[19] callr_3.2.0 IRanges_2.18.1 ps_1.3.0
[22] curl_3.3 AnnotationDbi_1.46.0 Rcpp_1.0.1
[25] readr_1.3.1 backports_1.1.4 BiocManager_1.30.4
[28] limma_3.40.2 desc_1.2.0 S4Vectors_0.22.0
[31] pkgload_1.0.2 fs_1.3.1 bit_1.1-14
[34] hms_0.4.2 digest_0.6.19 stringi_1.4.3
[37] processx_3.3.1 dplyr_0.8.1 rprojroot_1.3-2
[40] cli_1.1.0 tools_3.6.0 bitops_1.0-6
[43] magrittr_1.5 RCurl_1.95-4.12 tibble_2.1.3
[46] RSQLite_2.1.1 crayon_1.3.4 tidyr_0.8.3
[49] pkgconfig_2.0.2 xml2_1.2.0 prettyunits_1.0.2
[52] assertthat_0.2.1 httr_1.4.0 rstudioapi_0.10
[55] R6_2.4.0 compiler_3.6.0
That worked great. Thanks, Mike!