Question: GDCdownload does not work
11 months ago, miki716 wrote:

Hi, I am trying to use GDCdownload (package TCGAbiolinks), but I get an error every time I try to download my data. The query runs and the download starts, but after a while it stops with an error message. I have tried other queries (even ones from the Bioconductor web page) and I get the same message.

The code comes from the paper titled "TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages" (Silva).

Here is my code and the messages I am getting:

 

> query.met.gbm=GDCquery(project="TCGA-GBM", legacy=TRUE, data.category="DNA methylation", platform="Illumina Human Methylation 450", barcode=c("TCGA-76-4926-01B-01D-1481-05", "TCGA-28-5211-01C-11D-1844-05"))
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
[1] "https://api.gdc.cancer.gov/legacy/files/?pretty=true&expand=cases.samples.portions.analytes.aliquots,cases.project,center,analysis,cases.samples&size=988&filters=%7B%22op%22:%22and%22,%22content%22:[%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22cases.project.project_id%22,%22value%22:[%22TCGA-GBM%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.data_category%22,%22value%22:[%22DNA%20methylation%22]%7D%7D,%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22files.platform%22,%22value%22:[%22Illumina%20Human%20Methylation%20450%22]%7D%7D]%7D&format=JSON"
ooo Project: TCGA-GBM
--------------------
oo Filtering results
--------------------
ooo By platform
ooo By barcode
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> GDCdownload(query.met.gbm)
Downloading data for project TCGA-GBM
GDCdownload will download 2 files. A total of 42.603084 MB
Downloading as: Tue_Dec_18_11_13_35_2018.tar.gz
Downloading: 20 MB     <simpleWarning in file.create(to[okay]): cannot create file 'GDCdata/TCGA-GBM/legacy/DNA_methylation/Methylation_beta_value/0faddb5f-fe60-4269-90cd-736048a5b061/jhu-usc.edu_GBM.HumanMethylation450.6.lvl-3.TCGA-76-4926-01B-01D-1481-05.txt', reason 'No such file or directory'>
<simpleWarning in file.create(to[okay]): cannot create file 'GDCdata/TCGA-GBM/legacy/DNA_methylation/Methylation_beta_value/abb0c4c1-9249-4582-8fa3-34c0a5e3b8e6/jhu-usc.edu_GBM.HumanMethylation450.8.lvl-3.TCGA-28-5211-01C-11D-1844-05.txt', reason 'No such file or directory'>


> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252    LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C                   LC_TIME=Spanish_Spain.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.10.0

loaded via a namespace (and not attached):
  [1] colorspace_1.3-2            selectr_0.4-1               rjson_0.2.20                hwriter_1.3.2               circlize_0.4.5              XVector_0.22.0             
  [7] GenomicRanges_1.34.0        GlobalOptions_0.1.0         ggpubr_0.2                  matlab_1.0.2                ggrepel_0.8.0               bit64_0.9-7                
 [13] AnnotationDbi_1.44.0        xml2_1.2.0                  codetools_0.2-15            splines_3.5.1               R.methodsS3_1.7.1           doParallel_1.0.14          
 [19] DESeq_1.34.0                geneplotter_1.60.0          knitr_1.21                  jsonlite_1.6                Rsamtools_1.34.0            km.ci_0.5-2                
 [25] broom_0.5.1                 annotate_1.60.0             cluster_2.0.7-1             R.oo_1.22.0                 readr_1.3.0                 compiler_3.5.1             
 [31] httr_1.4.0                  backports_1.1.3             assertthat_0.2.0            Matrix_1.2-14               lazyeval_0.2.1              limma_3.38.3               
 [37] prettyunits_1.0.2           tools_3.5.1                 bindrcpp_0.2.2              gtable_0.2.0                glue_1.3.0                  GenomeInfoDbData_1.2.0     
 [43] dplyr_0.7.8                 ggthemes_4.0.1              ShortRead_1.40.0            Rcpp_1.0.0                  Biobase_2.42.0              Biostrings_2.50.1          
 [49] nlme_3.1-137                rtracklayer_1.42.1          iterators_1.0.10            xfun_0.4                    stringr_1.3.1               rvest_0.3.2                
 [55] XML_3.98-1.16               edgeR_3.24.2                zoo_1.8-4                   zlibbioc_1.28.0             scales_1.0.0                aroma.light_3.12.0         
 [61] hms_0.4.2                   parallel_3.5.1              SummarizedExperiment_1.12.0 RColorBrewer_1.1-2          curl_3.2                    ComplexHeatmap_1.20.0      
 [67] memoise_1.1.0               gridExtra_2.3               KMsurv_0.1-5                ggplot2_3.1.0               downloader_0.4              biomaRt_2.38.0             
 [73] latticeExtra_0.6-28         stringi_1.2.4               RSQLite_2.1.1               genefilter_1.64.0           S4Vectors_0.20.1            foreach_1.4.4              
 [79] GenomicFeatures_1.34.1      BiocGenerics_0.28.0         BiocParallel_1.16.2         shape_1.4.4                 GenomeInfoDb_1.18.1         rlang_0.3.0.1              
 [85] pkgconfig_2.0.2             matrixStats_0.54.0          bitops_1.0-6                lattice_0.20-35             purrr_0.2.5                 bindr_0.1.1                
 [91] cmprsk_2.2-7                GenomicAlignments_1.18.0    bit_1.1-14                  tidyselect_0.2.5            plyr_1.8.4                  magrittr_1.5               
 [97] R6_2.3.0                    IRanges_2.16.0              generics_0.0.2              DelayedArray_0.8.0          DBI_1.0.0                   mgcv_1.8-24                
[103] pillar_1.3.1                survival_2.42-3             RCurl_1.95-4.11             tibble_1.4.2                EDASeq_2.16.0               crayon_1.3.4               
[109] survMisc_0.5.5              GetoptLong_0.1.7            progress_1.2.0              locfit_1.5-9.1              grid_3.5.1                  sva_3.30.0                 
[115] data.table_1.11.8           blob_1.1.1                  ConsensusClusterPlus_1.46.0 digest_0.6.18               xtable_1.8-3                tidyr_0.8.2                
[121] R.utils_2.7.0               stats4_3.5.1                munsell_0.5.0               survminer_0.4.3  
Answer: GDCdownload does not work
11 months ago, James W. MacDonald (United States) wrote:

The critical part of the error (which is pretty self-explanatory) is this:

cannot create file '<blah blah blah>' reason 'No such file or directory'

In this instance, that means the directory you are specifying doesn't exist, so R can't create a file there. Ideally there would be some error checking that does something like

if(!dir.exists(dirname(<some random path name>))) dir.create(dirname(<some random path name>), recursive = TRUE)

to ensure that whatever path you are specifying actually exists first. But failing that, you do get a pretty clear error, IMO.
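
A minimal sketch of the kind of check described above, in base R. The helper name `ensure_parent_dir` and the example path are hypothetical; this is not TCGAbiolinks code, only an illustration of creating the parent directory before the file.

```r
# Hypothetical helper: make sure the directory that will hold `path` exists.
ensure_parent_dir <- function(path) {
  parent <- dirname(path)
  if (!dir.exists(parent)) {
    # recursive = TRUE creates every missing level of the directory tree
    dir.create(parent, recursive = TRUE)
  }
  invisible(parent)
}

# Made-up nested path under the session's temporary directory
target <- file.path(tempdir(), "GDCdata", "some-project", "example.txt")
ensure_parent_dir(target)
created <- file.create(target)
```

`dir.create()` returns FALSE with a warning when it fails, so real error checking would also inspect its return value.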

 

Answer: GDCdownload does not work
10 months ago, Tiago Chedraoui Silva (Brazil - University of São Paulo / Los Angeles - Cedars-Sinai Medical Center) wrote:

From the Microsoft documentation: "In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters" (https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file#maximum-path-length-limitation)

 

In Windows 10 you can enable long paths: https://www.howtogeek.com/266621/how-to-make-windows-10-accept-file-paths-over-260-characters/

 


Huh?

> z <- "GDCdata/TCGA-GBM/legacy/DNA_methylation/Methylation_beta_value/0faddb5f-fe60-4269-90cd-736048a5b061/jhu-usc.edu_GBM.HumanMethylation450.6.lvl-3.TCGA-76-4926-01B-01D-1481-05.txt"
> length(strsplit(z, "")[[1]])
[1] 176
Reply written 10 months ago by James W. MacDonald

You should consider the full path, not the relative path.

z <- file.path(getwd(),"GDCdata/TCGA-GBM/legacy/DNA_methylation/Methylation_beta_value/0faddb5f-fe60-4269-90cd-736048a5b061/jhu-usc.edu_GBM.HumanMethylation450.6.lvl-3.TCGA-76-4926-01B-01D-1481-05.txt")
length(strsplit(z, "")[[1]])
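
The same count can be written with `nchar()` and compared against the 260-character limit. This is only a sketch: `max_path` is the MAX_PATH value quoted above, `path_length` is a hypothetical helper, and the working directory is a made-up example, not the one from the question.

```r
max_path <- 260  # MAX_PATH value from the Windows documentation quoted above

# Hypothetical helper: length of the full path for a given working directory
path_length <- function(relative_path, wd = getwd()) {
  nchar(file.path(wd, relative_path))
}

rel <- "GDCdata/TCGA-GBM/legacy/DNA_methylation/Methylation_beta_value/0faddb5f-fe60-4269-90cd-736048a5b061/jhu-usc.edu_GBM.HumanMethylation450.6.lvl-3.TCGA-76-4926-01B-01D-1481-05.txt"

# Made-up working directory, for illustration only
example_wd <- "C:/Users/someone/Documents/projects/tcga-analysis"

n <- path_length(rel, example_wd)  # characters in the full path
room_left <- max_path - n          # characters left before the limit
```

The shorter the working directory, the more room is left before the limit.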
Reply written 10 months ago by Tiago Chedraoui Silva

Yes, I saw this and changed it, but I am getting the same result. In any case, the full path is shorter than 260 characters, so that should not be the problem.

Well, I don't understand why, but it is working now. I put it directly in D:\\R and now it works. Thanks!

Reply written 10 months ago by miki716