Problem downloading TCGA projects with the method "client"
sann • 0
@sann-23188
Last seen 4.7 years ago
Netherlands

Hi, I am trying to download TCGA projects with the GDCdownload function. The "client" method fails with an error, while the "api" method works fine. I would still prefer the "client" method, because it should be more stable than the "api" method. The code below reproduces the error.

The following code is copied from the terminal:

> library(TCGAbiolinks)
> query <-
+         GDCquery(
+             project = "TCGA-ESCA",
+             data.category = "Transcriptome Profiling",
+             data.type = "Gene Expression Quantification",
+             workflow.type = "HTSeq - Counts"
+         )
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-ESCA
--------------------
oo Filtering results
--------------------
ooo By data.type
ooo By workflow.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> GDCdownload(query, method = "client")
Downloading data for project TCGA-ESCA
trying URL 'https://gdc.cancer.gov/files/public/file/gdc-client_v1.5.0_Windows_x64_0.zip'
Content type 'application/zip' length 15221595 bytes (14.5 MB)
downloaded 14.5 MB

Error in unzip(basename(bin)) : invalid zip name argument
In addition: Warning message:
In if (grepl("^https?://", url)) { :
  the condition has length > 1 and only the first element will be used

At this point the script stops: the TCGA project data is not downloaded, so I cannot work with it. If I run the same code with method = "api", the download succeeds, but the "api" method is less stable.

I also inserted the output from the terminal when using the sessionInfo() command:

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] TCGAbiolinks_2.15.3

loaded via a namespace (and not attached):
  [1] pkgcond_0.1.0               colorspace_1.4-1
  [3] ggsignif_0.6.0              selectr_0.4-2
  [5] hwriter_1.3.2               testextra_0.1.0.1
  [7] XVector_0.26.0              GenomicRanges_1.38.0
  [9] ggpubr_0.2.5                ggrepel_0.8.1
 [11] bit64_0.9-7                 AnnotationDbi_1.48.0
 [13] xml2_1.2.2                  codetools_0.2-16
 [15] splines_3.6.2               R.methodsS3_1.8.0
 [17] doParallel_1.0.15           DESeq_1.38.0
 [19] geneplotter_1.64.0          knitr_1.28
 [21] jsonlite_1.6.1              Rsamtools_2.2.3
 [23] km.ci_0.5-2                 broom_0.5.5
 [25] annotate_1.64.0             dbplyr_1.4.2
 [27] png_0.1-7                   R.oo_1.23.0
 [29] readr_1.3.1                 compiler_3.6.2
 [31] httr_1.4.1                  backports_1.1.5
 [33] assertthat_0.2.1            Matrix_1.2-18
 [35] limma_3.42.2                prettyunits_1.1.1
 [37] tools_3.6.2                 gtable_0.3.0
 [39] glue_1.3.1                  GenomeInfoDbData_1.2.2
 [41] dplyr_0.8.4                 ggthemes_4.2.0
 [43] rappdirs_0.3.1              ShortRead_1.44.3
 [45] Rcpp_1.0.3                  Biobase_2.46.0
 [47] vctrs_0.2.3                 Biostrings_2.54.0
 [49] nlme_3.1-144                rtracklayer_1.46.0
 [51] iterators_1.0.12            xfun_0.12
 [53] stringr_1.4.0               testthat_2.3.1
 [55] rvest_0.3.5                 lifecycle_0.1.0
 [57] XML_3.99-0.3                edgeR_3.28.1
 [59] zoo_1.8-7                   postlogic_0.1.0.1
 [61] zlibbioc_1.32.0             scales_1.1.0
 [63] aroma.light_3.16.0          hms_0.5.3
 [65] parallel_3.6.2              SummarizedExperiment_1.16.1
 [67] RColorBrewer_1.1-2          curl_4.3
 [69] memoise_1.1.0               gridExtra_2.3
 [71] KMsurv_0.1-5                ggplot2_3.3.0
 [73] downloader_0.4              biomaRt_2.42.0
 [75] latticeExtra_0.6-29         stringi_1.4.6
 [77] RSQLite_2.2.0               genefilter_1.68.0
 [79] S4Vectors_0.24.3            foreach_1.4.8
 [81] GenomicFeatures_1.38.2      BiocGenerics_0.32.0
 [83] BiocParallel_1.20.1         GenomeInfoDb_1.22.0
 [85] rlang_0.4.4                 pkgconfig_2.0.3
 [87] matrixStats_0.55.0          bitops_1.0-6
 [89] lattice_0.20-38             purrr_0.3.3
 [91] GenomicAlignments_1.22.1    bit_1.1-15.2
 [93] tidyselect_1.0.0            plyr_1.8.5
 [95] magrittr_1.5                R6_2.4.1
 [97] IRanges_2.20.2              generics_0.0.2
 [99] DelayedArray_0.12.2         DBI_1.1.0
[101] mgcv_1.8-31                 pillar_1.4.3
[103] survival_3.1-8              RCurl_1.98-1.1
[105] tibble_2.1.3                EDASeq_2.20.0
[107] crayon_1.3.4                survMisc_0.5.5
[109] purrrogress_0.1.1           BiocFileCache_1.10.2
[111] jpeg_0.1-8.1                progress_1.2.2
[113] locfit_1.5-9.1              grid_3.6.2
[115] sva_3.34.0                  data.table_1.12.8
[117] blob_1.2.1                  digest_0.6.25
[119] xtable_1.8-4                tidyr_1.0.2
[121] R.utils_2.9.2               openssl_1.4.1
[123] stats4_3.6.2                munsell_0.5.0
[125] survminer_0.4.6             parsetools_0.1.2
[127] askpass_1.1

Could someone please tell me what I am doing wrong and how I can fix this?

Thanks a lot!

bioconductor GDCdownload • 1.1k views

Just to add, the exact same problem also occurs on macOS Catalina 10.15.4 (R 3.6.3, platform x86_64-apple-darwin15.6.0 (64-bit)). I suggest you use the "api" method and download the files in chunks, as that seems to work like a charm.
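
A minimal sketch of the chunked "api" download, assuming the query object from the original post. The wrapper name download_in_chunks and the chunk size of 10 are only illustrative choices; files.per.chunk is the GDCdownload argument that splits the download into smaller batches.

```r
# Sketch: download a GDCquery result through the API in small batches.
# download_in_chunks and the default chunk_size of 10 are illustrative
# assumptions, not values recommended by the package.
download_in_chunks <- function(query, chunk_size = 10) {
    TCGAbiolinks::GDCdownload(
        query,
        method = "api",               # the method that works in this thread
        files.per.chunk = chunk_size  # number of files requested per batch
    )
}
```

With the query built as in the question, this would be called as download_in_chunks(query). A smaller chunk size means more requests but less to repeat if one of them fails.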
