biomaRt getBM() gets stuck
@ariel-simon-24168
Last seen 2.8 years ago
Israel

Hi guys, the getBM function just doesn't work from my RStudio.

This is the code I used, taken directly from the 'biomaRt users guide':


library("biomaRt")
library('Rcpp')
listMarts()
ensembl=useMart("ensembl")
listDatasets(ensembl) #check if hsapiens_gene_ensembl is there

ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl) 

affyids=c("202763_at","209310_s_at","207500_at")

### HERE IS THE PROBLEM ###

getBM(attributes=c('affy_hg_u133_plus_2', 'entrezgene'), 
      filters = 'affy_hg_u133_plus_2', 
      values = affyids, 
      mart = ensembl)

I don't get any error or response. The RStudio console just stays stuck in "thinking" mode for hours until I quit the session. Any ideas?

RStudio version - 1.3.1093

R version - 4.1.0

Thanks! Ariel

biomaRt getbm ensembl • 2.0k views
@james-w-macdonald-5106
Last seen 1 day ago
United States

There's an error in the query: the attribute for the NCBI Gene ID is 'entrezgene_id', not 'entrezgene'. Queries against a remote database can fail intermittently, so it's usually reasonable to try a couple of times, although you would normally get an error message if there were a problem. Anyway, with the corrected attribute name:

> getBM(c("affy_hg_u133_plus_2","entrezgene_id"), "affy_hg_u133_plus_2", affyids, ensembl)
  affy_hg_u133_plus_2 entrezgene_id
1           202763_at           836
2         209310_s_at           837
3           207500_at           838
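
A mistyped attribute name like this can be caught before the query is sent. The sketch below is only an illustration: `missing_attributes` is a hypothetical helper, not part of biomaRt, and it assumes the data frame returned by `listAttributes()` has a `name` column.

```r
# Hypothetical helper: return the requested attribute names that the mart
# does not offer. `available` is a data frame with a `name` column, such as
# the one returned by biomaRt::listAttributes(mart).
missing_attributes <- function(requested, available) {
  setdiff(requested, available$name)
}

# Usage with a live mart (needs network access, so left commented out):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# missing_attributes(c("affy_hg_u133_plus_2", "entrezgene"),
#                    listAttributes(ensembl))
```

An empty result means every requested name is available; anything returned should be corrected before calling getBM().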

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_1.0.7     biomaRt_2.50.1

loaded via a namespace (and not attached):
 [1] KEGGREST_1.34.0        progress_1.2.2         tidyselect_1.1.1      
 [4] purrr_0.3.4            vctrs_0.3.8            generics_0.1.1        
 [7] stats4_4.1.2           BiocFileCache_2.2.0    utf8_1.2.2            
[10] blob_1.2.2             XML_3.99-0.8           rlang_0.4.12          
[13] pillar_1.6.4           glue_1.5.1             withr_2.4.3           
[16] DBI_1.1.1              rappdirs_0.3.3         BiocGenerics_0.40.0   
[19] bit64_4.0.5            dbplyr_2.1.1           GenomeInfoDbData_1.2.7
[22] lifecycle_1.0.1        stringr_1.4.0          zlibbioc_1.40.0       
[25] Biostrings_2.62.0      memoise_2.0.1          Biobase_2.54.0        
[28] IRanges_2.28.0         fastmap_1.1.0          GenomeInfoDb_1.30.0   
[31] curl_4.3.2             AnnotationDbi_1.56.2   fansi_0.5.0           
[34] filelock_1.0.2         cachem_1.0.6           S4Vectors_0.32.3      
[37] XVector_0.34.0         bit_4.0.4              hms_1.1.1             
[40] png_0.1-7              digest_0.6.29          stringi_1.7.6         
[43] dplyr_1.0.7            tools_4.1.2            bitops_1.0-7          
[46] magrittr_2.0.1         RCurl_1.98-1.5         RSQLite_2.2.9         
[49] tibble_3.1.6           crayon_1.4.2           pkgconfig_2.0.3       
[52] ellipsis_0.3.2         xml2_1.3.3             prettyunits_1.1.1     
[55] assertthat_0.2.1       httr_1.4.2             R6_2.5.1              
[58] compiler_4.1.2
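
The "try a couple of times" advice above could be automated roughly as follows. This is a minimal sketch in base R; `with_retry` is a hypothetical helper, and it only handles calls that fail with an error. A call that hangs silently, as in the original question, would not be interrupted by it.

```r
# Hypothetical helper: call f() up to `attempts` times, pausing `wait`
# seconds between failed attempts. Returns the first successful result,
# or rethrows the last error if every attempt fails.
with_retry <- function(f, attempts = 3, wait = 1) {
  stopifnot(attempts >= 1)
  for (i in seq_len(attempts)) {
    outcome <- tryCatch(
      list(ok = TRUE, value = f()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (outcome$ok) return(outcome$value)
    message("Attempt ", i, " failed: ", conditionMessage(outcome$error))
    if (i < attempts) Sys.sleep(wait)
  }
  stop(outcome$error)
}

# Usage with the query from this thread (needs network access):
# res <- with_retry(function() getBM(
#   c("affy_hg_u133_plus_2", "entrezgene_id"),
#   "affy_hg_u133_plus_2", affyids, ensembl))
```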

Hi James!

Thank you for responding so fast.

The bad news - I tried your solution but RStudio still got stuck.

The good news - I solved it!!

More problem details:

1) I couldn't update my Rcpp package to 1.0.7; it stayed at version 1.0.6. Once in a while I also got the error: "function 'Rcpp_precious_remove' not provided by package 'Rcpp'".

2) Every time I started RStudio, this warning appeared: "R graphics engine version 14 is not supported by this version of RStudio. The Plots tab will be disabled until a newer version of RStudio is installed."

Solution details:

1) Uninstall RStudio and install the newest version. R itself was also updated (4.1.0 --> 4.1.2).

2) Make sure that the current Rcpp version is 1.0.7.

3) Add "useCache = FALSE" to the getBM() call, as the code below shows.

Code that works now:

library("biomaRt")
library('Rcpp')
listMarts()
ensembl=useMart("ensembl")
listDatasets(ensembl) #check if hsapiens_gene_ensembl is there

ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl) 

affyids=c("202763_at","209310_s_at","207500_at")

X <- getBM(attributes=c('affy_hg_u133_plus_2', 'entrezgene_id'), 
      filters = 'affy_hg_u133_plus_2', 
      values = affyids, 
      mart = ensembl, useCache = FALSE)

# X is: 
# affy_hg_u133_plus_2 entrezgene_id
# 1           202763_at           836
# 2         209310_s_at           837
# 3           207500_at           838
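
Step 2 of the solution above (checking the Rcpp version) can be done from the console. A minimal sketch in base R follows; `has_min_version` is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: TRUE if `pkg` is installed and its version is at
# least `min_version` (given as a string such as "1.0.7").
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# Check for the Rcpp version mentioned in this thread
has_min_version("Rcpp", "1.0.7")
```

If this returns FALSE, reinstalling Rcpp in a fresh R session before loading biomaRt is a reasonable next step.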

Session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
system code page: 65001

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_1.0.7     biomaRt_2.48.3

loaded via a namespace (and not attached):
 [1] pillar_1.6.4           dbplyr_2.1.1           compiler_4.1.2         GenomeInfoDb_1.30.0    XVector_0.32.0        
 [6] prettyunits_1.1.1      bitops_1.0-7           tools_4.1.2            zlibbioc_1.38.0        progress_1.2.2        
[11] digest_0.6.27          bit_4.0.4              tibble_3.1.2           BiocFileCache_2.2.0    RSQLite_2.2.7         
[16] memoise_2.0.1          lifecycle_1.0.1        pkgconfig_2.0.3        png_0.1-7              rlang_0.4.11          
[21] DBI_1.1.2              rstudioapi_0.13        filelock_1.0.2         curl_4.3.2             fastmap_1.1.0         
[26] GenomeInfoDbData_1.2.7 xml2_1.3.3             dplyr_1.0.6            httr_1.4.2             stringr_1.4.0         
[31] rappdirs_0.3.3         generics_0.1.1         Biostrings_2.60.2      S4Vectors_0.30.0       vctrs_0.3.8           
[36] IRanges_2.26.0         hms_1.1.1              tidyselect_1.1.1       stats4_4.1.2           bit64_4.0.5           
[41] glue_1.4.2             Biobase_2.52.0         R6_2.5.1               fansi_0.5.0            AnnotationDbi_1.56.2  
[46] XML_3.99-0.7           purrr_0.3.4            blob_1.2.2             magrittr_2.0.1         ellipsis_0.3.2        
[51] BiocGenerics_0.40.0    assertthat_0.2.1       KEGGREST_1.34.0        utf8_1.2.1             stringi_1.6.1         
[56] RCurl_1.98-1.4         cachem_1.0.5           crayon_1.4.2

Thanks!!

Ariel


It's strange that the biomaRt cache is causing an issue without printing anything to the screen, but it seems clear that's where the problem lies if useCache = FALSE fixes things. Maybe something in the cache is corrupted, in which case you can use biomartCacheClear() to delete whatever's there and start again. Hopefully then you won't need the useCache = FALSE argument.
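
In practice the cache reset might look like the sketch below. It assumes a biomaRt version recent enough to provide the cache functions, and it reuses the `ensembl` and `affyids` objects defined earlier in the thread.

```r
library(biomaRt)

# Show where the cache lives and how much it holds
biomartCacheInfo()

# Delete everything in the cache so later queries start from scratch
biomartCacheClear()

# Then retry the query with the default caching behaviour
# (needs network access, so left commented out):
# getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene_id"),
#       filters = "affy_hg_u133_plus_2",
#       values = affyids,
#       mart = ensembl)
```

If the query still hangs after clearing, keeping useCache = FALSE remains a workable fallback.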
