Question

'biomaRt expected a character string of length 1' error with getBM and refseq_mrna filter


laurenz.holcik • 0

@laurenzholcik-20105

Last seen 2.1 years ago

Austria

Hi all,

I am currently trying to convert the Illumina Human Methylation 450k annotation from the provided RefSeq IDs to Ensembl gene IDs (if anybody has done this before and could give me the files, that would be perfect!). When I use biomaRt to get the Ensembl IDs of the ~20000 mRNA RefSeq IDs, I always get the error:

Error in getBM(attributes = c("refseq_mrna", "ensembl_gene_id"), filters = "refseq_mrna",  : 
   The query to the BioMart webservice returned an invalid result: biomaRt expected a character string of length 1.

This doesn't occur immediately but after about 50% of the query. I have tried to change the mart to

mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", 
                   dataset = "hsapiens_gene_ensembl", 
                   host = 'www.ensembl.org',
                   ensemblRedirect = FALSE)

as I read that this helped others, but it made no difference for me. Is there an argument I can pass to getBM to ignore the error and continue with the next value? It does not matter if I don't get all of the IDs.

My query is:

bm = getBM(attributes=c('refseq_mrna', 'ensembl_gene_id'), 
  filters = 'refseq_mrna', 
  values = rsnm, 
  mart = mart)

> head(rsnm)
[1] "NM_006521"    "NM_001166660" "NM_181303"    "NM_018977"    "NM_000117"    "NM_002547"

SessionInfo:

> sessionInfo('biomaRt')
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS/LAPACK: /opt/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64_lin/libmkl_rt.so

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_AT.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
character(0)

other attached packages:
[1] biomaRt_2.40.0

biomart • 2.9k views

ADD COMMENT • link updated 4.9 years ago by Mike Smith ★ 6.5k • written 4.9 years ago by laurenz.holcik • 0

score 4 · Accepted Answer · 2019-05-23


Mike Smith ★ 6.5k

@mike-smith

Last seen 9 hours ago

EMBL Heidelberg

It might be that the main Ensembl site is being a bit slow today. You can try querying one of the mirror sites e.g.

mart <- useEnsembl(biomart = "ensembl", 
                   dataset = "hsapiens_gene_ensembl", 
                   mirror = "useast")

Values for the mirror argument are: useast, uswest, asia.
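
If a mirror alone does not help, one way to "ignore the error and go on" is to split the IDs into batches and wrap each query in tryCatch, so that a failed batch is skipped instead of aborting the whole run. The following is only a minimal sketch under stated assumptions: query_in_batches, query_fun and batch_size are hypothetical names made up for illustration (they are not part of biomaRt), and the batch size of 500 is arbitrary.

```r
# Hypothetical helper (not part of biomaRt): run `query_fun` on `values`
# in batches and skip any batch whose query raises an error.
query_in_batches <- function(values, query_fun, batch_size = 500) {
  # Split the values into consecutive groups of at most `batch_size` elements
  batches <- split(values, ceiling(seq_along(values) / batch_size))

  results <- lapply(batches, function(batch) {
    tryCatch(
      query_fun(batch),
      error = function(e) {
        # Report the failure and carry on with the next batch
        message("Skipping batch of ", length(batch), " IDs: ",
                conditionMessage(e))
        NULL
      }
    )
  })

  # Drop the failed batches and combine the rest into one data frame
  results <- Filter(Negate(is.null), results)
  if (length(results) == 0) {
    return(NULL)
  }
  do.call(rbind, unname(results))
}

# Assumed usage with the objects from the question (rsnm, mart):
# bm <- query_in_batches(rsnm, function(ids) {
#   getBM(attributes = c("refseq_mrna", "ensembl_gene_id"),
#         filters = "refseq_mrna", values = ids, mart = mart)
# })
```

IDs in a skipped batch are simply missing from the result, so it may be worth comparing the returned refseq_mrna column against rsnm afterwards and re-running only the missing ones.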

ADD COMMENT • link 4.9 years ago Mike Smith ★ 6.5k


Solved it, thank you!

ADD REPLY • link 4.9 years ago laurenz.holcik • 0


Thank you! I ran into the same problem, and using a mirror solved it.

ADD REPLY • link 4.9 years ago matt.dufort ▴ 10