Hello all,
I am having a problem with the goseq package. I ran an analysis a few months ago without any trouble, and I now need to reuse the script for another part of the project. I used exactly the same commands with the new data, but I get an error message when calling the nullp function. I then tried to redo my old analysis, and I now get the same error there too.
Here are my command and the error:
pwf.D1 = nullp(D1.vector, 'mm10', 'ensGene')
Loading required package: rtracklayer
Loading required package: GenomicRanges
Can't find mm10/ensGene length data in genLenDataBase...
Trying to download from UCSC. This might take a couple of minutes.
Error in value[[3L]](cond) :
  Length information for genome mm10 and gene ID ensGene is not available. You will have to specify bias.data manually.
In addition: Warning messages:
1: package ‘rtracklayer’ was built under R version 3.1.3
2: package ‘GenomicRanges’ was built under R version 3.1.2
Here is the command I used a few months ago, with the messages printed while the analysis was running (kept as comments):
pwf.D1 = nullp(D1.vector, 'mm10', 'ensGene')
#Loading required package: rtracklayer
#Loading required package: GenomicRanges
#Can't find mm10/ensGene length data in genLenDataBase... Trying to download from UCSC.
#This might take a couple of minutes.
#Warning message:
#In pcls(G) : initial point very close to some inequality constraints
I recently updated my Bioconductor packages (I think I was using goseq_1.16.2 before) and I am running an old R version (still the build for Snow Leopard), so I suspect a version compatibility problem. Here is my sessionInfo():
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] limma_3.22.7 rtracklayer_1.26.3 GenomicRanges_1.18.4 goseq_1.18.0 AnnotationDbi_1.28.2
 [6] GenomeInfoDb_1.2.5 IRanges_2.0.1 S4Vectors_0.4.0 Biobase_2.26.0 BiocGenerics_0.12.1
[11] RSQLite_1.0.0 DBI_0.3.1 geneLenDataBase_1.1.1 BiasedUrn_1.06.1

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2 BatchJobs_1.6 BBmisc_1.9 BiocParallel_1.0.3
 [5] biomaRt_2.22.0 Biostrings_2.34.1 bitops_1.0-6 brew_1.0-6
 [9] checkmate_1.5.3 codetools_0.2-11 colorspace_1.2-6 digest_0.6.8
[13] fail_1.2 foreach_1.4.2 GenomicAlignments_1.2.2 GenomicFeatures_1.18.7
[17] ggplot2_1.0.1 GO.db_3.0.0 grid_3.1.1 gtable_0.1.2
[21] iterators_1.0.7 lattice_0.20-31 magrittr_1.5 MASS_7.3-40
[25] Matrix_1.2-0 mgcv_1.8-6 munsell_0.4.2 nlme_3.1-120
[29] plyr_1.8.2 proto_0.3-10 Rcpp_0.11.6 RCurl_1.95-4.6
[33] reshape2_1.4.1 Rsamtools_1.18.3 scales_0.2.4 sendmailR_1.2-1
[37] stringi_0.4-1 stringr_1.0.0 tools_3.1.1 XML_3.98-1.1
[41] XVector_0.6.0 zlibbioc_1.12.0
So, does anyone have an idea where the problem comes from? Should I install a newer R version, or just manually install goseq_1.16.2?
Thank you for your help.
Best,
Nicolas
Hi Nicolas,
The UCSC download uses rtracklayer, so your issue may come from that package having been built under a different version of R than the one you are running.
In the newest version of Bioconductor (3.1) and goseq (1.20), we've moved towards fetching the gene lengths from TxDb packages rather than geneLenDataBase. So if you upgrade to R 3.2 and reinstall all the packages plus TxDb.Mmusculus.UCSC.mm10.ensGene, goseq shouldn't need to fetch the gene lengths from UCSC at all.
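As a rough sketch of that upgrade path (not a tested recipe), the check below reports which of the two packages named above are still missing before nullp is called. The helper name missing_packages is made up for illustration, and the installation command depends on the Bioconductor release: biocLite() at the time of this thread, BiocManager::install() on current releases.

```r
# Sketch only: assumes R >= 3.2 and a matching Bioconductor release.
# Packages named in the reply above.
needed <- c("goseq", "TxDb.Mmusculus.UCSC.mm10.ensGene")

# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  # Install these with the installer that matches your Bioconductor release.
  message("Install first: ", paste(to_install, collapse = ", "))
} else {
  library(goseq)
  # D1.vector is the named 0/1 vector from the original post:
  # pwf.D1 <- nullp(D1.vector, "mm10", "ensGene")
}
```

The nullp call is left commented out because it needs the real D1.vector from the analysis.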
Cheers,
Nadia.
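For anyone who cannot upgrade right away, the error message itself suggests another route: passing gene lengths to nullp through its bias.data argument. The sketch below only shows how such a vector could be lined up with the DE vector. The helper align_bias_data and the toy gene IDs and lengths are invented for illustration, and where the lengths come from is left open (one option, assuming GenomicFeatures and a TxDb package are installed, is noted in a comment).

```r
# Hypothetical helper: reorder a named vector of gene lengths so that it
# matches the named 0/1 DE vector, as bias.data must be in the same order.
align_bias_data <- function(de_vector, gene_lengths) {
  stopifnot(!is.null(names(de_vector)), !is.null(names(gene_lengths)))
  bias <- gene_lengths[names(de_vector)]  # NA where a gene has no length
  names(bias) <- names(de_vector)
  bias
}

# Toy data, made up for illustration (not from the thread).
de <- c(ENSMUSG01 = 1, ENSMUSG02 = 0, ENSMUSG03 = 0)
lens <- c(ENSMUSG03 = 2100, ENSMUSG01 = 1500)
bias <- align_bias_data(de, lens)

# One possible source of real lengths (assumption, not run here):
#   txs <- GenomicFeatures::transcriptsBy(txdb, by = "gene")
#   lens <- median(width(txs))
# With real data and goseq installed, the call would then be:
#   pwf.D1 <- goseq::nullp(D1.vector, bias.data = bias)
```

Genes without a known length come back as NA rather than being dropped, so the result keeps the same length and order as the DE vector.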
Hello Nadia,
I updated my R and goseq versions, and it now works without trying to retrieve information from UCSC (and the transcript lengths are the same as before). Thank you for your help!
Best,
Nicolas