Just a comment on the import, it would probably be best to use as="NumericList" for that type of data. It will probably still be pretty big though. You could loop over the chromosomes and use the which argument.
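A minimal sketch of that per-chromosome loop, assuming the scores sit in a bigWig file and are read with rtracklayer's import(). The function name summarise_by_chromosome, the bw_path argument and the per-chromosome summary are hypothetical and only there for illustration; nothing in this thread confirms the actual file or workflow.

```r
# Sketch only: assumes rtracklayer, GenomicRanges, IRanges and GenomeInfoDb
# are installed, and that 'bw_path' (hypothetical) points to a bigWig file.
summarise_by_chromosome <- function(bw_path,
                                    fun = function(x) mean(x),
                                    chromosomes = NULL) {
  bwf <- rtracklayer::BigWigFile(bw_path)
  si <- GenomeInfoDb::seqinfo(bwf)   # chromosome names and lengths from the file
  if (is.null(chromosomes)) {
    chromosomes <- GenomeInfoDb::seqnames(si)
  }
  vapply(chromosomes, function(chrom) {
    # restrict the import to a single chromosome through the 'which' argument
    which <- GenomicRanges::GRanges(
      chrom,
      IRanges::IRanges(1, GenomeInfoDb::seqlengths(si)[[chrom]])
    )
    # as = "NumericList" returns one plain numeric vector per range in 'which'
    scores <- rtracklayer::import(bwf, which = which, as = "NumericList")
    fun(scores[[1]])
  }, numeric(1))
}
```

Called as summarise_by_chromosome("scores.bw") (a made-up file name), this would return one value per chromosome while holding only one chromosome's scores in memory at a time. The fun argument has to return a single number; swap it for whatever per-chromosome computation you actually need.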
hi, i'm the maintainer of phastCons100way.UCSC.hg19. this package reduces memory requirements by rounding the phastCons scores to one decimal digit. i'm in the process of generating other types of GScores objects, including, for instance, phyloP scores with a quantization adapted to that kind of score; these will become available through the GenomicScores package and the AnnotationHub. if you think this one-decimal-digit rounding is acceptable for your purposes, i could also generate the GScores object for phastCons60way.UCSC.mm10. what do you think?
cheers,
robert.
=====================EDIT==============
hi, a GScores object for phastCons60way.UCSC.mm10 is now available through the devel version of the GenomicScores package as follows:
Hi Robert, the authors of ATACseqQC use the phastCons100way.UCSC.hg19 library, so I think the rounding should not be a problem. It would be great if you could generate phastCons60way.UCSC.mm10. Thank you.
thanks for reporting this problem. we're transitioning the server that hosts the genomic scores available through the AnnotationHub from the http:// protocol to the SSL-compliant https:// protocol, and it will take a few days until this transition is complete. in the meantime, we've just autogenerated a certificate that seems to make it work, at least in the release version of Bioconductor.
so, please try the above with the latest Bioconductor release version of GenomicScores (1.10.0). it seems to work on my computer:
library(GenomicScores)
phast <- getGScores("phastCons60way.UCSC.mm10")
snapshotDate(): 2019-10-29
download 59 resources? [y/n] y
|======================================================================| 100%
[...]
|======================================================================| 100%
loading from cache
phast
GScores object
# organism: Mus musculus (UCSC, mm10)
# provider: UCSC
# provider version: 17Apr2014
# download date: May 24, 2017
# loaded sequences: default
# maximum abs. error: 0.05
# use 'citation()' to cite these data in publications
let me clarify that in my answer above, now more than 2 years old, i asked you to use the devel version of Bioconductor because, at that time, it was the only way to get immediate access to the just-added resources for phastCons60way.UCSC.mm10. however, these are currently available in release, and there's no need for you to use the Bioconductor devel version to download these genomic scores.
Oh thanks Robert! I initially thought my error was due to not using the devel version, so I upgraded R (4.0) and then Bioconductor, but of course that didn't work :)
Thanks for autogenerating a certificate. I tried downloading with the same R and Bioconductor version (3.11) and it is still giving me the same error. I guess I should downgrade now and then try the download? I am on 1.11.2 for GenomicScores.
to use the release version of Bioconductor you should follow the instructions at http://bioconductor.org/install, which implies installing a release version of R (3.6.x) and then following the rest of the instructions.
to check whether you've successfully installed the current release version of Bioconductor (as of December 2019), when you call the function version() from the BiocManager package, you should get 3.10:
BiocManager::version()
[1] ‘3.10’
then you may proceed to install the current release version of GenomicScores, which is 1.10.0, by simply doing:
BiocManager::install("GenomicScores")
using this version of GenomicScores i'm able, on my linux box, to successfully download the phastCons60way.UCSC.mm10 score set, as shown above. due to the current problem with the SSL certificate, the development version does not work, and that's why my advice is to use the release version of GenomicScores. i don't know why, in this particular situation with the temporary autogenerated certificate, the release version works and the development version doesn't.