biomaRt - getBM() taking longer to run than usual?
@pierre-francois-roux-7997
France

Dear BioC community,

I recently updated my R installation as well as all my BioC packages, including biomaRt.

Before the update, the code below (microarray probe annotation, 40k probes) ran in less than one minute; it now takes more than half an hour.

library(biomaRt)

probes <- row.names(data_rma_ets1_f)

# Connect to the Ensembl BioMart for hg19 (aka GRCh37)
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL",
                   host="grch37.ensembl.org",
                   path="/biomart/martservice",
                   dataset="hsapiens_gene_ensembl")

# Querying Biomart to map probe names to various features
annotation_ensembl <- getBM(attributes = c("affy_hta_2_0",
                                           "ensembl_gene_id",
                                           "chromosome_name",
                                           "start_position",
                                           "end_position",
                                           "strand",
                                           "entrezgene",
                                           "hgnc_symbol"),
                            filters = "affy_hta_2_0",
                            values = gsub("\\.1", "", probes),
                            mart = ensembl)
Batch submitting query [==-----------------------------------------------------------------------------------------------]   2% eta: 40m

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rtracklayer_1.38.3                      EnsDb.Hsapiens.v75_2.99.0               ensembldb_2.2.2                        
 [4] AnnotationFilter_1.2.0                  TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.30.3                 
 [7] AnnotationDbi_1.40.0                    GenomicRanges_1.30.3                    GenomeInfoDb_1.14.0                    
[10] biomaRt_2.36.1                          pd.hta.2.0_3.12.2                       DBI_1.0.0                              
[13] RSQLite_2.1.1                           oligo_1.42.0                            Biostrings_2.46.0                      
[16] XVector_0.18.0                          IRanges_2.12.0                          S4Vectors_0.16.0                       
[19] Biobase_2.38.0                          oligoClasses_1.40.0                     BiocGenerics_0.24.0                    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17                  lattice_0.20-35               prettyunits_1.0.2             Rsamtools_1.30.0             
 [5] assertthat_0.2.0              digest_0.6.15                 foreach_1.4.4                 mime_0.5                     
 [9] R6_2.2.2                      httr_1.3.1                    BiocInstaller_1.28.0          zlibbioc_1.24.0              
[13] progress_1.1.2                curl_3.2                      lazyeval_0.2.1                blob_1.1.1                   
[17] Matrix_1.2-14                 preprocessCore_1.40.0         splines_3.4.4                 RMySQL_0.10.15               
[21] BiocParallel_1.12.0           AnnotationHub_2.10.1          stringr_1.3.1                 ProtGenerics_1.10.0          
[25] RCurl_1.95-4.10               bit_1.1-14                    shiny_1.1.0                   DelayedArray_0.4.1           
[29] compiler_3.4.4                httpuv_1.4.3                  pkgconfig_2.0.1               htmltools_0.3.6              
[33] SummarizedExperiment_1.8.1    GenomeInfoDbData_1.0.0        interactiveDisplayBase_1.16.0 ff_2.2-14                    
[37] codetools_0.2-15              matrixStats_0.53.1            XML_3.98-1.11                 later_0.7.2                  
[41] GenomicAlignments_1.14.2      bitops_1.0-6                  grid_3.4.4                    xtable_1.8-2                 
[45] magrittr_1.5                  stringi_1.2.2                 promises_1.0.1                affyio_1.48.0                
[49] iterators_1.0.9               tools_3.4.4                   bit64_0.9-7                   yaml_2.1.19                  
[53] memoise_1.1.0                 affxparser_1.50.0   

Is there anything I can do to make the annotation process faster with the latest release of biomaRt?

Many thanks for your advice!

Cheers!

-Pef-

 


How old was your previous version? Did it print the 'Batch submitting query' message? The previous versions that didn't use the batch submission method were faster, but had a tendency to silently drop results when you queried for more than 500 values at a time.
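To make the batching idea concrete, here is a minimal sketch of splitting the filter values into chunks of at most 500 and combining the per-chunk results. It is not biomaRt's internal implementation: `split_into_batches()`, `query_in_batches()` and `fake_query()` are hypothetical helpers written for illustration, and the batch size of 500 is simply taken from the threshold mentioned above. Since recent biomaRt versions already batch internally, this is unlikely to be faster on its own, but it makes it easy to time or inspect individual batches.

```r
# Hypothetical helper (not part of biomaRt): split a vector of filter values
# into chunks of at most `batch_size` elements, after removing duplicates.
split_into_batches <- function(values, batch_size = 500) {
  values <- unique(values)
  if (length(values) == 0) return(list())
  unname(split(values, ceiling(seq_along(values) / batch_size)))
}

# Hypothetical helper: run `query_fun` on each batch and row-bind the results.
query_in_batches <- function(values, query_fun, batch_size = 500) {
  batches <- split_into_batches(values, batch_size)
  results <- lapply(batches, query_fun)
  do.call(rbind, results)
}

# Offline demonstration with a stand-in query function, so the sketch runs
# without contacting the Ensembl server.
fake_query <- function(ids) {
  data.frame(id = ids, n_char = nchar(ids), stringsAsFactors = FALSE)
}
demo_ids <- sprintf("probe%04d", 1:1200)
demo <- query_in_batches(demo_ids, fake_query)
batch_sizes <- lengths(split_into_batches(demo_ids))

# With biomaRt, the query function could wrap getBM(), assuming `ensembl`
# is the mart object created in the question:
# bm_query <- function(ids) {
#   getBM(attributes = c("affy_hta_2_0", "ensembl_gene_id"),
#         filters = "affy_hta_2_0",
#         values = ids,
#         mart = ensembl)
# }
# annotation <- query_in_batches(gsub("\\.1", "", probes), bm_query)
```

Removing duplicated probe IDs before querying, as the helper does with `unique()`, also keeps the number of submitted values as small as possible.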
