GDCprepare() can never complete request ("Error: cannot allocate vector")
aush ▴ 40

I'm trying to pull TCGA data as follows:

library(TCGAbiolinks)

# Keep only the TCGA projects from the full GDC project list
projects <- getGDCprojects()$project_id
projects <- projects[grepl('^TCGA', projects, perl = TRUE)]
# One query spanning all TCGA projects
query <- GDCquery(project = projects,
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  workflow.type = "HTSeq - Counts")
# Read the downloaded files into a single object and save it
counts <- GDCprepare(query, save = TRUE, save.filename = "all_tumor_htseq_raw_counts.rda")

The last command prints something like "100% Completed after 19 m" and then either freezes (nothing happens, but the R session is shown as busy) or fails with "Error: cannot allocate vector of size 473 Kb". My system paging file (pagefile.sys) grows to 17 GB (its usual size is 8 GB). Session info is below. I would be grateful for any hints!

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
system code page: 1251

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.16.3 vautils_0.1.1.102   magrittr_1.5        data.table_1.13.0  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5                  lattice_0.20-41             tidyr_1.1.2                 prettyunits_1.1.1          
 [5] assertthat_0.2.1            digest_0.6.25               BiocFileCache_1.12.1        plyr_1.8.6                 
 [9] R6_2.4.1                    GenomeInfoDb_1.24.2         stats4_4.0.2                RSQLite_2.2.0              
[13] httr_1.4.2                  ggplot2_3.3.2               pillar_1.4.6                zlibbioc_1.34.0            
[17] rlang_0.4.7                 progress_1.2.2              curl_4.3                    rstudioapi_0.11            
[21] blob_1.2.1                  S4Vectors_0.26.1            R.utils_2.10.1              R.oo_1.24.0                
[25] Matrix_1.2-18               downloader_0.4              readr_1.3.1                 stringr_1.4.0              
[29] RCurl_1.98-1.2              bit_4.0.4                   biomaRt_2.44.1              munsell_0.5.0              
[33] DelayedArray_0.14.1         xfun_0.16                   compiler_4.0.2              pkgconfig_2.0.3            
[37] askpass_1.1                 BiocGenerics_0.34.0         openssl_1.4.2               tidyselect_1.1.0           
[41] SummarizedExperiment_1.18.2 tibble_3.0.3                GenomeInfoDbData_1.2.3      IRanges_2.22.2             
[45] matrixStats_0.56.0          XML_3.99-0.5                crayon_1.3.4                dplyr_1.0.2                
[49] dbplyr_1.4.4                bitops_1.0-6                R.methodsS3_1.8.1           rappdirs_0.3.1             
[53] grid_4.0.2                  jsonlite_1.7.0              gtable_0.3.0                lifecycle_0.2.0            
[57] DBI_1.1.0                   scales_1.1.1                stringi_1.4.6               XVector_0.28.0             
[61] xml2_1.3.2                  ellipsis_0.3.1              generics_0.0.2              vctrs_0.3.4                
[65] tools_4.0.2                 bit64_4.0.5                 Biobase_2.48.0              glue_1.4.2                 
[69] purrr_0.3.4                 hms_0.5.3                   parallel_4.0.2              AnnotationDbi_1.50.3       
[73] colorspace_1.4-1            GenomicRanges_1.40.0        rvest_0.3.6                 memoise_1.1.0              
[77] knitr_1.29
TCGAbiolinks TCGA • 263 views
Brazil - University of São Paulo/ Los A…

How much RAM does your machine have? Very likely the request needs more memory than the machine has. Did you try preparing the datasets separately (for example, one project at a time)?
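
As a rough illustration of why memory is the likely culprit, the size of a dense numeric matrix can be estimated from its dimensions. The sketch below is only back-of-the-envelope arithmetic: `estimate_matrix_gib()` is a hypothetical helper, and the row and column counts (about 60,000 genes and 11,000 samples) are illustrative guesses, not numbers taken from this thread. Note also that the small size in "cannot allocate vector of size 473 Kb" is not a contradiction: R reports the size of the single allocation that failed, not the total memory already in use.

```r
# Hypothetical helper: approximate size of a dense matrix in GiB.
# bytes_per_cell is 8 for R doubles and 4 for R integers.
estimate_matrix_gib <- function(n_rows, n_cols, bytes_per_cell = 8) {
  (n_rows * n_cols * bytes_per_cell) / 2^30
}

# Illustrative guesses (assumptions, not measured values):
n_genes   <- 60000   # rows in one expression table
n_samples <- 11000   # samples across all TCGA projects

size_gib <- estimate_matrix_gib(n_genes, n_samples)
cat(sprintf("One copy of the matrix: about %.1f GiB\n", size_gib))
```

Under these assumed dimensions a single copy stored as doubles comes to roughly 5 GiB. Merging many per-sample tables typically creates temporary copies along the way, so peak usage can be several times the final object size, which would be consistent with a 16 GB machine running out.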


I have 16 GB of RAM. I suspected that it was not enough, but it's strange that the error message only mentions a "vector of size 473 Kb".

Will try splitting the query.
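
One way splitting the query might look is sketched below: each project is queried, prepared, and saved to its own file, so only one project's data is in memory at a time. This is a minimal sketch, not a tested recipe. `prepare_one_project()` and `prepare_each()` are hypothetical helper names, the output file naming is an assumption, and the `GDCquery()` arguments are copied from the question. The `GDCdownload()` call is included on the assumption that the files may not yet be on disk.

```r
# Hypothetical helper: query, download, and prepare a single project.
# Requires the TCGAbiolinks package when actually called.
prepare_one_project <- function(project, out_file) {
  query <- TCGAbiolinks::GDCquery(
    project = project,
    data.category = "Transcriptome Profiling",
    data.type = "Gene Expression Quantification",
    workflow.type = "HTSeq - Counts"
  )
  # GDCprepare() reads files from disk, so they must be downloaded first
  TCGAbiolinks::GDCdownload(query)
  TCGAbiolinks::GDCprepare(query, save = TRUE, save.filename = out_file)
  invisible(out_file)
}

# Hypothetical driver: process projects one by one, one output file each.
# prepare_one can be swapped out, e.g. for a dry run.
prepare_each <- function(projects, out_dir = ".",
                         prepare_one = prepare_one_project) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  out_files <- file.path(out_dir, paste0(projects, "_htseq_raw_counts.rda"))
  for (i in seq_along(projects)) {
    if (file.exists(out_files[i])) next  # skip projects already prepared
    prepare_one(projects[i], out_files[i])
    invisible(gc())                      # release memory before the next project
  }
  out_files
}

# Example call (downloads a lot of data, so it is left commented out):
# projects <- TCGAbiolinks::getGDCprojects()$project_id
# projects <- projects[grepl("^TCGA", projects)]
# files <- prepare_each(projects, out_dir = "tcga_counts")
```

Saving per-project files keeps the peak memory footprint closer to that of the largest single project. Combining all of them into one object afterwards would still need enough memory to hold the full data set, so it may be preferable to analyse projects separately or to load only the ones needed.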

