GenomicScores package: availableGScores and getGScores functions not working
nattzy94 ▴ 20
@nattzy94-23466
Last seen 2.1 years ago
Singapore

I am using the GenomicScores package to fetch phyloP100way scores to calculate conservation scores for my sequences.

I ran the availableGScores() function to see which scores are available. However, I get the following message:

Error: failed to load external entity "http://functionalgenomics.upf.edu/annotationhub"

Running getGScores("phastCons100way.UCSC.hg19") results in this error:

Error in UseMethod("filter_") :
  no applicable method for 'filter_' applied to an object of class "c('tbl_SQLiteConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"
In addition: Warning messages:
1: select_() was deprecated in dplyr 0.7.0.
Please use select() instead.
This warning is displayed once every 8 hours.
Call lifecycle::last_warnings() to see where this warning was generated.
2: filter_() was deprecated in dplyr 0.7.0.
Please use filter() instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call lifecycle::last_warnings() to see where this warning was generated.

Does anyone have a workaround?

Robert Castelo ★ 3.4k
@rcastelo
Last seen 1 day ago
Barcelona/Universitat Pompeu Fabra

hi,

At this moment (9:30am CET) I can't reproduce this error with the current release version of the package. I have executed the following code without problems:

library(GenomicScores)
avgsco <- availableGScores()
head(avgsco)
                                  Organism      Category Installed Cached
cadd.v1.3.hg19                Homo sapiens Pathogenicity     FALSE  FALSE
fitCons.UCSC.hg19             Homo sapiens    Constraint     FALSE  FALSE
linsight.UCSC.hg19            Homo sapiens    Constraint     FALSE  FALSE
MafDb.1Kgenomes.phase1.GRCh38 Homo sapiens           MAF     FALSE  FALSE
MafDb.1Kgenomes.phase1.hs37d5 Homo sapiens           MAF     FALSE  FALSE
MafDb.1Kgenomes.phase3.GRCh38 Homo sapiens           MAF     FALSE  FALSE
                              BiocManagerInstall AnnotationHub
cadd.v1.3.hg19                             FALSE          TRUE
fitCons.UCSC.hg19                           TRUE         FALSE
linsight.UCSC.hg19                         FALSE          TRUE
MafDb.1Kgenomes.phase1.GRCh38               TRUE         FALSE
MafDb.1Kgenomes.phase1.hs37d5               TRUE         FALSE
MafDb.1Kgenomes.phase3.GRCh38               TRUE         FALSE
gsco <- getGScores("phastCons100way.UCSC.hg19")
gsco
GScores object 
# organism: Homo sapiens (UCSC, hg19)
# provider: UCSC
# provider version: 09Feb2014
# download date: Mar 17, 2017
# loaded sequences: default
# maximum abs. error: 0.05
# use 'citation()' to cite these data in publications
gsco2 <- getGScores("phyloP100way.UCSC.hg19") ## you mention you actually wanted to fetch this one
gsco2
GScores object 
# organism: Homo sapiens (UCSC, hg19)
# provider: UCSC
# provider version: 10Feb2014
# download date: May 12, 2017
# loaded sequences: default
# maximum abs. error: 0.55
# use 'citation()' to cite these data in publications

Please find below the output of my sessionInfo(). Could you post yours?

cheers,

robert.

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] GenomicScores_2.4.0  GenomicRanges_1.44.0 GenomeInfoDb_1.28.4 
[4] IRanges_2.26.0       S4Vectors_0.30.0     BiocGenerics_0.38.0 
[7] colorout_1.2-2      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7                    lattice_0.20-44              
 [3] png_0.1-7                     Biostrings_2.60.2            
 [5] assertthat_0.2.1              digest_0.6.27                
 [7] utf8_1.2.2                    mime_0.11                    
 [9] BiocFileCache_2.0.0           R6_2.5.1                     
[11] RSQLite_2.2.8                 httr_1.4.2                   
[13] pillar_1.6.2                  zlibbioc_1.38.0              
[15] rlang_0.4.11                  curl_4.3.2                   
[17] rstudioapi_0.13               blob_1.2.2                   
[19] Matrix_1.3-4                  AnnotationHub_3.0.1          
[21] RCurl_1.98-1.4                bit_4.0.4                    
[23] shiny_1.6.0                   DelayedArray_0.18.0          
[25] HDF5Array_1.20.0              compiler_4.1.0               
[27] httpuv_1.6.3                  pkgconfig_2.0.3              
[29] htmltools_0.5.2               KEGGREST_1.32.0              
[31] tidyselect_1.1.1              tibble_3.1.4                 
[33] GenomeInfoDbData_1.2.6        interactiveDisplayBase_1.30.0
[35] matrixStats_0.60.1            XML_3.99-0.7                 
[37] fansi_0.5.0                   withr_2.4.2                  
[39] crayon_1.4.1                  dplyr_1.0.7                  
[41] dbplyr_2.1.1                  later_1.3.0                  
[43] rhdf5filters_1.4.0            bitops_1.0-7                 
[45] rappdirs_0.3.3                grid_4.1.0                   
[47] xtable_1.8-4                  lifecycle_1.0.0              
[49] DBI_1.1.1                     magrittr_2.0.1               
[51] cachem_1.0.6                  XVector_0.32.0               
[53] promises_1.2.0.1              ellipsis_0.3.2               
[55] filelock_1.0.2                generics_0.1.0               
[57] vctrs_0.3.8                   Rhdf5lib_1.14.2              
[59] tools_4.1.0                   bit64_4.0.5                  
[61] Biobase_2.52.0                glue_1.4.2                   
[63] purrr_0.3.4                   BiocVersion_3.13.1           
[65] MatrixGenerics_1.4.3          fastmap_1.1.0                
[67] yaml_2.2.1                    AnnotationDbi_1.54.1         
[69] rhdf5_2.36.0                  BiocManager_1.30.16          
[71] memoise_2.0.0                

Thanks for the quick response. Below is the output of my sessionInfo():

R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GenomicScores_1.10.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.1  IRanges_2.20.2      
[5] S4Vectors_0.24.4     BiocGenerics_0.32.0  dplyr_1.0.6         

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                    lattice_0.20-40               Rsamtools_2.2.3              
 [4] Biostrings_2.54.0             assertthat_0.2.1              digest_0.6.25                
 [7] utf8_1.1.4                    mime_0.9                      BiocFileCache_1.10.2         
[10] R6_2.4.1                      cellranger_1.1.0              RSQLite_2.2.0                
[13] httr_1.4.2                    pillar_1.6.0                  zlibbioc_1.32.0              
[16] rlang_0.4.11                  curl_4.3                      readxl_1.3.1                 
[19] rstudioapi_0.13               blob_1.2.1                    Matrix_1.2-18                
[22] BiocParallel_1.20.1           AnnotationHub_2.18.0          RCurl_1.98-1.3               
[25] bit_1.1-15.2                  DelayedArray_0.12.3           shiny_1.6.0                  
[28] rtracklayer_1.46.0            compiler_3.6.2                httpuv_1.6.1                 
[31] pkgconfig_2.0.3               htmltools_0.5.1.1             SummarizedExperiment_1.16.1  
[34] tidyselect_1.1.1              tibble_3.1.1                  GenomeInfoDbData_1.2.2       
[37] interactiveDisplayBase_1.24.0 matrixStats_0.58.0            XML_3.99-0.3                 
[40] fansi_0.4.1                   crayon_1.4.1                  dbplyr_2.1.1                 
[43] later_1.2.0                   GenomicAlignments_1.22.1      bitops_1.0-6                 
[46] rappdirs_0.3.1                grid_3.6.2                    xtable_1.8-4                 
[49] lifecycle_1.0.0               DBI_1.1.0                     magrittr_2.0.1               
[52] cli_2.5.0                     XVector_0.26.0                promises_1.2.0.1             
[55] ellipsis_0.3.2                generics_0.0.2                vctrs_0.3.8                  
[58] tools_3.6.2                   bit64_0.9-7                   BSgenome_1.54.0              
[61] Biobase_2.46.0                glue_1.4.2                    purrr_0.3.4                  
[64] BiocVersion_3.10.1            fastmap_1.0.1                 yaml_2.2.1                   
[67] AnnotationDbi_1.48.0          BiocManager_1.30.10           memoise_1.1.0

I tried to install the latest version of GenomicScores (2.4.0) by downloading the tar.gz and running install.packages('GenomicScores_2.4.0.tar.gz', repos=NULL, type='source').

However, I get the following error:

Installing package into ‘/Users/nathaniel/Library/R/3.6/library’
(as ‘lib’ is unspecified)
ERROR: dependencies ‘rhdf5’, ‘HDF5Array’ are not available for package ‘GenomicScores’
* removing ‘/Users/nathaniel/Library/R/3.6/library/GenomicScores’
Warning in install.packages :
  installation of package ‘GenomicScores_2.4.0.tar.gz’ had non-zero exit status

hi,

Indeed, you were using an outdated version of the package. To install the current release version, please first read the instructions at https://bioconductor.org/install carefully. There you'll see that you first need to install the current release version of R (4.1.x); see https://cran.r-project.org. Once you have installed R 4.1.x, follow the instructions for installing GenomicScores on its Bioconductor landing page at https://bioconductor.org/packages/GenomicScores
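
For reference, the steps above could be sketched roughly as follows. This is only a minimal sketch, not the official procedure: the helper names r_is_recent() and install_genomicscores() are hypothetical, and the 4.1.0 threshold is simply the R release mentioned in this thread. BiocManager::install() is the standard Bioconductor installer, which also resolves Bioconductor dependencies such as rhdf5 and HDF5Array against the matching Bioconductor release.

```r
# Hypothetical helper: TRUE when the running R is at least `min`.
# The default of "4.1.0" is an assumption taken from this thread.
r_is_recent <- function(min = "4.1.0", current = getRversion()) {
    current >= package_version(min)
}

# Hypothetical wrapper around the usual Bioconductor install steps.
install_genomicscores <- function() {
    if (!r_is_recent())
        stop("R ", as.character(getRversion()),
             " is too old; install a current R release first")
    # BiocManager itself is installed from CRAN.
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    # Installs GenomicScores together with its Bioconductor dependencies.
    BiocManager::install("GenomicScores")
}

# install_genomicscores()   # run interactively; needs network access
```

Afterwards, packageVersion("GenomicScores") and sessionInfo() show which versions actually ended up installed.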


EDIT: Ignore this; I have (belatedly) re-read the warning message in more detail and am now moving the cache.

Thanks for the reply. I have since updated R and GenomicScores successfully. However, when trying to access phastCons46wayPlacental.UCSC.hg19 scores via phastcons46way <- getGScores('phastCons46wayPlacental.UCSC.hg19'), I get the following warning:

snapshotDate(): 2021-05-18
loading from cache
Warning message:
DEPRECATION: As of AnnotationHub (>2.23.2), default caching location has changed.
  Problematic cache: /Users/nathaniel/Library/Caches/AnnotationHub
  See https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/TroubleshootingTheCache.html#default-caching-location-update

I tried to access the scores using gscores(phastcons46way, GRanges(seqnames="chr22", IRanges(start=50967020:50967025, width=1))). This results in the following error:

no phastCons46wayPlacental scores for population default in sequence chr22 from GScores object x (phastCons46wayPlacental.UCSC.hg19).
GRanges object with 6 ranges and 1 metadata column:
      seqnames    ranges strand |   default
         <Rle> <IRanges>  <Rle> | <numeric>
  [1]    chr22  50967020      * |        NA
  [2]    chr22  50967021      * |        NA
  [3]    chr22  50967022      * |        NA
  [4]    chr22  50967023      * |        NA
  [5]    chr22  50967024      * |        NA
  [6]    chr22  50967025      * |        NA
  -------
  seqinfo: 93 sequences (1 circular) from Genome Reference Consortium GRCh37 genome
Warning message:
In .scores_snrs(x, ranges, pop, summaryFun, quantized, scores.only,  :
  No dequantization function available for scores population default. Scores will be all NA for this population.

Hi, a note for future users who run into a similar problem: the package doesn't seem to work with resources that AnnotationHub downloaded into the old cache, so one really needs to move to the new cache location, following the instructions linked from the warning:

https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/TroubleshootingTheCache.html#default-caching-location-update
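
In case the link goes stale, moving the cache could look roughly like the sketch below. It is not the vignette's code verbatim: move_cache() is a hypothetical helper name, the old location is assumed to be the "Problematic cache" path printed in the warning, and the new location is assumed to be the one returned by tools::R_user_dir() (available in R >= 4.0). The linked page remains the authoritative source.

```r
# Hypothetical helper: move every file from `olddir` into `newdir`.
move_cache <- function(olddir, newdir) {
    dir.create(newdir, recursive = TRUE, showWarnings = FALSE)
    files <- list.files(olddir, full.names = TRUE)
    # file.rename() returns TRUE/FALSE per file; it can fail when the
    # two directories live on different filesystems.
    moved <- vapply(files, function(fl) {
        file.rename(fl, file.path(newdir, basename(fl)))
    }, logical(1))
    # Remove the old directory only if every file was moved.
    if (all(moved)) unlink(olddir, recursive = TRUE)
    invisible(all(moved))
}

# Assumed usage (paths are assumptions, adjust to your own warning message):
# olddir <- path.expand("~/Library/Caches/AnnotationHub")
# newdir <- tools::R_user_dir("AnnotationHub", which = "cache")
# move_cache(olddir, newdir)
```

If some files cannot be moved, the old directory is left in place, so nothing is lost and the call can be retried or the remaining files copied by hand.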

Once that step is done, then it works smoothly:

library(GenomicScores)
phastcons46way <- getGScores('phastCons46wayPlacental.UCSC.hg19')
snapshotDate(): 2021-05-18
loading from cache
gsco <- gscores(phastcons46way, GRanges(seqnames="chr22", IRanges(start=50967020:50967025, width=1)))
gsco
GRanges object with 6 ranges and 1 metadata column:
      seqnames    ranges strand |   default
         <Rle> <IRanges>  <Rle> | <numeric>
  [1]    chr22  50967020      * |       0.2
  [2]    chr22  50967021      * |       0.2
  [3]    chr22  50967022      * |       0.2
  [4]    chr22  50967023      * |       0.2
  [5]    chr22  50967024      * |       0.2
  [6]    chr22  50967025      * |       0.1
  -------
  seqinfo: 93 sequences (1 circular) from Genome Reference Consortium GRCh37 genome