Question: biomaRt - getBM() taking longer to run than usual?
Written 3 months ago by Pierre-François Roux (France):

Dear BioC community,

I recently updated my R installation as well as all my BioC packages, including biomaRt.

Before the update, the code below (microarray probe annotation, 40k probes) ran in less than one minute. It now takes more than half an hour.

library(biomaRt)

probes <- row.names(data_rma_ets1_f)

# Connection to BioMart hg19 (aka GRCh37)
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL",
                   host="grch37.ensembl.org",
                   path="/biomart/martservice",
                   dataset="hsapiens_gene_ensembl")

# Querying Biomart to map probe names to various features
annotation_ensembl <- getBM(attributes = c("affy_hta_2_0",
                                           "ensembl_gene_id",
                                           "chromosome_name",
                                           "start_position",
                                           "end_position",
                                           "strand",
                                           "entrezgene",
                                           "hgnc_symbol"),
                            filters = "affy_hta_2_0",
                            values = gsub("\\.1", "", probes),
                            mart = ensembl)
Batch submitting query [==-----------------------------------------------------------------------------------------------]   2% eta: 40m

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtracklayer_1.38.3                      EnsDb.Hsapiens.v75_2.99.0               ensembldb_2.2.2                        
 [4] AnnotationFilter_1.2.0                  TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.30.3                 
 [7] AnnotationDbi_1.40.0                    GenomicRanges_1.30.3                    GenomeInfoDb_1.14.0                    
[10] biomaRt_2.36.1                          pd.hta.2.0_3.12.2                       DBI_1.0.0                              
[13] RSQLite_2.1.1                           oligo_1.42.0                            Biostrings_2.46.0                      
[16] XVector_0.18.0                          IRanges_2.12.0                          S4Vectors_0.16.0                       
[19] Biobase_2.38.0                          oligoClasses_1.40.0                     BiocGenerics_0.24.0                    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17                  lattice_0.20-35               prettyunits_1.0.2             Rsamtools_1.30.0             
 [5] assertthat_0.2.0              digest_0.6.15                 foreach_1.4.4                 mime_0.5                     
 [9] R6_2.2.2                      httr_1.3.1                    BiocInstaller_1.28.0          zlibbioc_1.24.0              
[13] progress_1.1.2                curl_3.2                      lazyeval_0.2.1                blob_1.1.1                   
[17] Matrix_1.2-14                 preprocessCore_1.40.0         splines_3.4.4                 RMySQL_0.10.15               
[21] BiocParallel_1.12.0           AnnotationHub_2.10.1          stringr_1.3.1                 ProtGenerics_1.10.0          
[25] RCurl_1.95-4.10               bit_1.1-14                    shiny_1.1.0                   DelayedArray_0.4.1           
[29] compiler_3.4.4                httpuv_1.4.3                  pkgconfig_2.0.1               htmltools_0.3.6              
[33] SummarizedExperiment_1.8.1    GenomeInfoDbData_1.0.0        interactiveDisplayBase_1.16.0 ff_2.2-14                    
[37] codetools_0.2-15              matrixStats_0.53.1            XML_3.98-1.11                 later_0.7.2                  
[41] GenomicAlignments_1.14.2      bitops_1.0-6                  grid_3.4.4                    xtable_1.8-2                 
[45] magrittr_1.5                  stringi_1.2.2                 promises_1.0.1                affyio_1.48.0                
[49] iterators_1.0.9               tools_3.4.4                   bit64_0.9-7                   yaml_2.1.19                  
[53] memoise_1.1.0                 affxparser_1.50.0   

Is there anything I can do to make the annotation process faster with the latest release of biomaRt?

Many thanks for your advice!

Cheers!

-Pef-

 


How old was your previous version? Did it use to print the 'Batch submitting query' message? Previous versions that didn't use the batch submission method were faster, but had a tendency to silently drop results when you queried for more than 500 values at a time.

Reply written 3 months ago by Mike Smith
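To make the batching idea in the reply above concrete, here is a minimal sketch in base R of splitting a vector of probe IDs into chunks of at most 500 values after removing duplicates. The helper name split_into_batches and the example probe IDs are hypothetical and are not part of biomaRt. The sketch does not reproduce how getBM() batches internally; it only illustrates that fewer unique values mean fewer batches to submit.

```r
# Hypothetical helper (not a biomaRt function): drop duplicates and
# empty/NA entries, then split the IDs into chunks of at most `batch_size`.
split_into_batches <- function(ids, batch_size = 500) {
  ids <- unique(ids[!is.na(ids) & nzchar(ids)])
  if (length(ids) == 0) return(list())
  unname(split(ids, ceiling(seq_along(ids) / batch_size)))
}

# Made-up probe IDs for illustration only; 10 of them are repeated.
probe_ids <- sprintf("probe_%05d", 1:1200)
batches <- split_into_batches(c(probe_ids, probe_ids[1:10]))
lengths(batches)

# Assuming `ensembl` is the mart object from the question, each batch could
# then be queried separately and the results combined, for example:
# results <- do.call(rbind, lapply(batches, function(b)
#   getBM(attributes = c("affy_hta_2_0", "ensembl_gene_id"),
#         filters = "affy_hta_2_0", values = b, mart = ensembl)))
```

Deduplicating the values before the query (for instance after the gsub() step in the question) is a cheap first check, since repeated IDs would otherwise inflate the number of batches.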