Error when downloading data with recount3.
3
1
Entering edit mode
Zach Brehm ▴ 20
@brehmzp
Last seen 4.1 years ago
Rochester, NY

I'm attempting to download some GTEx data with recount3. When the download of the counts begins, I get an "invalid 'description' argument" error, along with a warning that one of the URLs being referenced is not available.

> library(recount3)
> human_projects <- available_projects()
2020-10-29 09:52:51 caching file sra.recount_project.MD.gz.
2020-10-29 09:52:51 caching file gtex.recount_project.MD.gz.
2020-10-29 09:52:51 caching file tcga.recount_project.MD.gz.
> proj_info <- subset(human_projects,
+                     project == "BLOOD_VESSEL" & project_type == "data_sources"
+                     )
> rse_blood_vessel <- create_rse(proj_info)
2020-10-29 09:52:54 downloading and reading the metadata.
2020-10-29 09:52:54 caching file gtex.gtex.BLOOD_VESSEL.MD.gz.
2020-10-29 09:52:54 caching file gtex.recount_project.BLOOD_VESSEL.MD.gz.
2020-10-29 09:52:55 caching file gtex.recount_qc.BLOOD_VESSEL.MD.gz.
2020-10-29 09:52:55 caching file gtex.recount_seq_qc.BLOOD_VESSEL.MD.gz.
2020-10-29 09:52:55 downloading and reading the feature information.
2020-10-29 09:52:55 caching file human.gene_sums.G026.gtf.gz.
2020-10-29 09:52:56 downloading and reading the counts: 1398 samples across 63856 features.
Error in file(file, "rt") : invalid 'description' argument
In addition: Warning message:
The 'url' <http://idies.jhu.edu/recount3/data/human/data_sources/gtex/gene_sums/EL/BLOOD_VESSEL/gtex.gene_sums.BLOOD_VESSEL.G026.gz> does not exist or is not available. 
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Pop!_OS 20.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] recount3_1.0.0              SummarizedExperiment_1.20.0 Biobase_2.50.0              GenomicRanges_1.42.0        GenomeInfoDb_1.26.0         IRanges_2.24.0              S4Vectors_0.28.0           
 [8] BiocGenerics_0.36.0         MatrixGenerics_1.2.0        matrixStats_0.57.0         

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.0         purrr_0.3.4              lattice_0.20-41          vctrs_0.3.4              generics_0.0.2           BiocFileCache_1.14.0     rtracklayer_1.50.0      
 [8] blob_1.2.1               XML_3.99-0.5             rlang_0.4.8              R.oo_1.24.0              pillar_1.4.6             withr_2.3.0              glue_1.4.2              
[15] DBI_1.1.0                R.utils_2.10.1           BiocParallel_1.24.0      rappdirs_0.3.1           bit64_4.0.5              dbplyr_1.4.4             sessioninfo_1.1.1       
[22] GenomeInfoDbData_1.2.4   lifecycle_0.2.0          zlibbioc_1.36.0          Biostrings_2.58.0        R.methodsS3_1.8.1        memoise_1.1.0            curl_4.3                
[29] fansi_0.4.1              Rcpp_1.0.5               DelayedArray_0.16.0      XVector_0.30.0           bit_4.0.4                Rsamtools_2.6.0          digest_0.6.27           
[36] dplyr_1.0.2              grid_4.0.3               cli_2.1.0                tools_4.0.3              bitops_1.0-6             magrittr_1.5             RCurl_1.98-1.2          
[43] tibble_3.0.4             RSQLite_2.2.1            crayon_1.3.4             pkgconfig_2.0.3          ellipsis_0.3.1           Matrix_1.2-18            data.table_1.13.2       
[50] assertthat_0.2.1         httr_1.4.2               rstudioapi_0.11          R6_2.5.0                 GenomicAlignments_1.26.0 compiler_4.0.3

Hi Zach,

Thanks for reporting this. We'll look into it.

Best, Leo


Hi Zach,

We found the issue and are working on resolving it. The data is already at IDIES, but some files are not in the right location. The URL that recount3 provides is the correct (intended) one; we made a small mistake on the IDIES side and are fixing it right now. We'll let you know once this is resolved.

Best, Leo

@lcolladotor
Last seen 5 days ago
United States

Hi,

Thanks for reporting this issue! It has been resolved now thanks to speedy work by Christopher Wilks. Thanks Chris!

The root problem was that we had uploaded the GTEx data to the IDIES data server but had not moved it to the paths expected by the recount3 R package. You don't need to update the R package; simply run the commands again.
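For readers who hit the same error, here is a minimal sketch of how one might check whether a counts file sits at the path the package asks for before re-running create_rse(). The helper names expected_counts_url() and url_is_reachable() are hypothetical, and the URL layout (including the two-letter directory taken from the end of the project name and the hard-coded "human" segment) is inferred only from the two URLs quoted in this thread, so treat it as an assumption rather than a specification. recount3 also exports locate_url(), which is likely the more reliable way to obtain the real URL.

```r
# Hypothetical helper: rebuild the counts URL following the pattern seen in
# the warnings quoted in this thread. The layout is an inferred assumption.
expected_counts_url <- function(project, data_source, type = "gene",
                                annotation_code = "G026",
                                base_url = "http://duffel.rail.bio/recount3") {
  # Directory named after the last two characters of the project
  # ("EL" for BLOOD_VESSEL, "VM" for UVM in the URLs above).
  shard <- substr(project, nchar(project) - 1, nchar(project))
  file_name <- sprintf("%s.%s_sums.%s.%s.gz",
                       data_source, type, project, annotation_code)
  paste(base_url, "human", "data_sources", data_source,
        paste0(type, "_sums"), shard, project, file_name, sep = "/")
}

# Hypothetical helper: TRUE if base R can open the URL, FALSE otherwise.
url_is_reachable <- function(u) {
  tryCatch({
    con <- url(u, open = "rb")
    close(con)
    TRUE
  },
  warning = function(w) FALSE,
  error = function(e) FALSE)
}

# Usage (needs network access), e.g. for the GTEx file from the question:
# u <- expected_counts_url("BLOOD_VESSEL", "gtex", "gene",
#                          base_url = "http://idies.jhu.edu/recount3/data")
# url_is_reachable(u)
```

If the check returns FALSE, the problem is probably on the server side, as it was here, and reinstalling the package is unlikely to help.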

Best, Leo


Leo's too kind; the error was all mine to start with.


Hehe, no worries Chris ^^ We all make errors like that all the time :P Thanks again Zach Brehm for reporting the issue.


Thanks Leo and Chris, I just tried it again and everything worked for me. Appreciate the help!

@20ad4ff3
Last seen 2.8 years ago
Poland

I have the same issue when trying to download "exon" data ("gene" data works ok):

> rse = create_rse(subset(human_projects, project=='UVM'), type = "exon", annotation = "gencode_v26")
2022-03-16 13:56:36 downloading and reading the metadata.
2022-03-16 13:56:38 caching file tcga.tcga.UVM.MD.gz.
2022-03-16 13:56:39 caching file tcga.recount_project.UVM.MD.gz.
2022-03-16 13:56:40 caching file tcga.recount_qc.UVM.MD.gz.
2022-03-16 13:56:42 caching file tcga.recount_seq_qc.UVM.MD.gz.
2022-03-16 13:56:44 downloading and reading the feature information.
2022-03-16 13:56:44 caching file human.exon_sums.G026.gtf.gz.
2022-03-16 13:57:06 downloading and reading the counts: 80 samples across 1299686 features.
Error in file(file, "rt") : invalid 'description' argument
In addition: Warning messages:
1: In UseMethod("depth") :
  no applicable method for 'depth' applied to an object of class "NULL"
2: The 'url' <http://duffel.rail.bio/recount3/human/data_sources/tcga/exon_sums/VM/UVM/tcga.exon_sums.UVM.G026.gz> does not exist or is not available. 
chris.wilks ▴ 70
@chriswilks-20546
Last seen 11 weeks ago
United States

This should be fixed now. It was a lingering permissions issue that only affected TCGA exon sums.
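Since a problem like this can affect one feature type and not another, it may help to test each type separately and see which one fails. Below is a minimal sketch under that assumption. check_types() is a hypothetical helper, not part of recount3; it takes any loader function with a single type argument, so the recount3 call itself appears only in the commented usage.

```r
# Hypothetical helper: call `loader` once per feature type and record
# whether each call succeeded. Failures are reported, not re-thrown.
check_types <- function(loader, types = c("gene", "exon")) {
  vapply(types, function(tp) {
    tryCatch({
      loader(tp)
      TRUE
    }, error = function(e) {
      message("type = ", tp, " failed: ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
}

# Usage (needs network access and the recount3 package; downloads the data):
# library(recount3)
# human_projects <- available_projects()
# proj_info <- subset(human_projects, project == "UVM")
# check_types(function(tp) {
#   create_rse(proj_info, type = tp, annotation = "gencode_v26")
# })
```

The result is a named logical vector with one entry per type, which makes it easy to report exactly which files are unavailable.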
