TCGAbiolinks connection error for GDCquery
University of Edinburgh

Dear all,

I'm using TCGAbiolinks to obtain IDAT files and metadata from the GDC server through GDCquery. This worked previously but now fails, even though the server status is reported as OK.

I am following the example for getting IDAT files from the legacy GDC server shown on this link:

I have tried running this code several times, after logging back in and on different days, with the same outcome. GDCquery commands work for the non-legacy server. Any suggestions about what I might be missing?

Thank you!


[1] "dfa394478bd39c11b89d3819a398898d99575a24"

[1] "Data Release 27.0 - October 29, 2020"

[1] "OK"

[1] "3.0.0"

[1] 1

library(TCGAbiolinks)

# Keep only the TCGA projects
projects <- TCGAbiolinks::getGDCprojects()$project_id
projects <- projects[grepl('^TCGA', projects, perl = TRUE)]

# Query the legacy server for 450k IDAT files, one project at a time
for (proj in projects) {
    print(proj)
    query <- GDCquery(project = proj,
                      data.category = "Raw microarray data",
                      data.type = "Raw intensities",
                      experimental.strategy = "Methylation array",
                      legacy = TRUE,
                      file.type = ".idat",
                      platform = "Illumina Human Methylation 450")
}
o GDCquery: Searching in GDC database
Genome of reference: hg19
Error in open.connection(con, "rb") : HTTP error 404.
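
Because the 404 appears intermittently, one option is to wrap each query in a small retry loop so that a transient failure does not abort the whole run. The following is only a sketch in base R: `with_retry` is a hypothetical helper, not part of TCGAbiolinks, and retrying will not help if the legacy endpoint is genuinely unavailable.

```r
# Hypothetical helper (not part of TCGAbiolinks): call f() up to `attempts`
# times, waiting `wait` seconds between failed attempts.
with_retry <- function(f, attempts = 3, wait = 5) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    # Capture the error object instead of letting it stop the loop
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) Sys.sleep(wait)
  }
  # Every attempt failed: re-signal the last error
  stop(last_error)
}

# Possible usage with the query above (needs network access, so left commented):
# queries <- list()
# for (proj in projects) {
#   queries[[proj]] <- with_retry(function() {
#     GDCquery(project = proj,
#              data.category = "Raw microarray data",
#              data.type = "Raw intensities",
#              experimental.strategy = "Methylation array",
#              legacy = TRUE,
#              file.type = ".idat",
#              platform = "Illumina Human Methylation 450")
#   })
# }
```

Storing the results in a list keyed by project also keeps every successful query, whereas the loop above overwrites `query` on each iteration.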

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.5 (Nitrogen)

Matrix products: default
BLAS:   /gpfs/igmmfs01/software/pkg/el7/apps/R/4.0.3/lib64/R/lib/
LAPACK: /gpfs/igmmfs01/software/pkg/el7/apps/R/4.0.3/lib64/R/lib/

 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.18.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5                  lattice_0.20-41            
 [3] tidyr_1.1.2                 prettyunits_1.1.1          
 [5] assertthat_0.2.1            digest_0.6.27              
 [7] BiocFileCache_1.14.0        plyr_1.8.6                 
 [9] R6_2.5.0                    GenomeInfoDb_1.26.2        
[11] stats4_4.0.3                RSQLite_2.2.2              
[13] httr_1.4.2                  ggplot2_3.3.3              
[15] pillar_1.4.7                zlibbioc_1.36.0            
[17] rlang_0.4.10                progress_1.2.2             
[19] curl_4.3                    data.table_1.13.6          
[21] blob_1.2.1                  S4Vectors_0.28.1           
[23] R.utils_2.10.1              R.oo_1.24.0                
[25] Matrix_1.2-18               downloader_0.4             
[27] readr_1.4.0                 stringr_1.4.0              
[29] RCurl_1.98-1.2              bit_4.0.4                  
[31] biomaRt_2.46.0              munsell_0.5.0              
[33] DelayedArray_0.16.0         xfun_0.19                  
[35] compiler_4.0.3              pkgconfig_2.0.3            
[37] askpass_1.1                 BiocGenerics_0.36.0        
[39] openssl_1.4.3               tidyselect_1.1.0           
[41] SummarizedExperiment_1.20.0 tibble_3.0.4               
[43] GenomeInfoDbData_1.2.4      IRanges_2.24.1             
[45] matrixStats_0.57.0          XML_3.99-0.5               
[47] crayon_1.3.4                dplyr_1.0.2                
[49] dbplyr_2.0.0                bitops_1.0-6               
[51] R.methodsS3_1.8.1           rappdirs_0.3.1             
[53] grid_4.0.3                  jsonlite_1.7.1             
[55] gtable_0.3.0                lifecycle_0.2.0            
[57] DBI_1.1.0                   magrittr_2.0.1             
[59] scales_1.1.1                TCGAbiolinksGUI.data_1.10.0
[61] stringi_1.5.3               XVector_0.30.0             
[63] xml2_1.3.2                  ellipsis_0.3.1             
[65] generics_0.1.0              vctrs_0.3.6                
[67] tools_4.0.3                 bit64_4.0.5                
[69] Biobase_2.50.0              glue_1.4.2                 
[71] purrr_0.3.4                 hms_0.5.3                  
[73] MatrixGenerics_1.2.0        parallel_4.0.3             
[75] AnnotationDbi_1.52.0        colorspace_2.0-0           
[77] GenomicRanges_1.42.0        rvest_0.3.6                
[79] memoise_1.1.0               knitr_1.30
MethylationArray TCGAbiolinks

Not really an answer, but it was working again last week with the exact same commands, and it has not worked since. I assume there is an ongoing issue with the legacy server?

I believe the issue might have been due to my using a computer cluster. Maybe the connection to the legacy server isn't stable from a cluster. In the end I ran the command on my laptop and transferred the IDAT files from there to the cluster. Not ideal, but it worked for a small dataset.
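
To test the cluster hypothesis, the HTTP status of the two API endpoints could be compared from the cluster and from the laptop. This is a rough sketch using base R's `curlGetHeaders`. `http_status` is a hypothetical helper, and the two URLs are my assumptions about the GDC API layout rather than something confirmed in this thread.

```r
# Hypothetical helper: return the HTTP status code for a URL, or NA if the
# request fails outright (for example, a firewall or proxy blocks it).
# `fetch` defaults to base R's curlGetHeaders and can be swapped for testing.
http_status <- function(url, fetch = curlGetHeaders) {
  tryCatch(
    as.integer(attr(fetch(url), "status")),
    error = function(e) NA_integer_
  )
}

# Assumed endpoints; adjust them if the GDC API layout differs.
endpoints <- c(
  current = "https://api.gdc.cancer.gov/status",
  legacy  = "https://api.gdc.cancer.gov/legacy/status"
)

# Run on both machines and compare (needs network access, so left commented):
# print(vapply(endpoints, http_status, integer(1)))
```

If the legacy endpoint returns 404 on the cluster but 200 on the laptop at the same moment, that would point to the network path rather than to the GDC server itself.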

