error reading from connection (Error in load(url(con)))
@xiaofeiwang18266-13498
Singapore

Dear community,

I got an error when using TCGAquery_recount2 from TCGAbiolinks. The code and the error are shown below. I searched for the error but could not figure out what causes it or how to fix it. A related thread is at https://support.bioconductor.org/p/125586/ , but I did not find an answer there.

Could you give me a clue as to why this happens? Thanks a lot!

Lung_recount2 <- TCGAquery_recount2(project = "tcga", tissue = "lung")
downloading Range Summarized Experiment for: lung
Error in load(url(con)) : cannot read from connection
In addition: Warning message:
In load(url(con)) :
  URL 'http://idies.jhu.edu/recount/data/v2/TCGA/rse_gene_lung.Rdata': Timeout of 60 seconds was reached


sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lsa_0.73.2                  SnowballC_0.7.0             rgr_1.1.15                  fastICA_1.2-2               MASS_7.3-53.1               limma_3.46.0               
 [7] survminer_0.4.9             ggpubr_0.4.0                stringr_1.4.0               survival_3.2-10             pvclust_2.2-0               factoextra_1.0.7           
[13] FactoMineR_2.4              pheatmap_1.0.12             DESeq2_1.30.1               ggplot2_3.3.3               VennDiagram_1.6.20          futile.logger_1.4.3        
[19] cowplot_1.1.1               SummarizedExperiment_1.20.0 Biobase_2.50.0              GenomicRanges_1.42.0        GenomeInfoDb_1.26.4         IRanges_2.24.1             
[25] S4Vectors_0.28.1            BiocGenerics_0.36.0         MatrixGenerics_1.2.1        matrixStats_0.58.0          data.table_1.14.0           dplyr_1.0.5                
[31] plyr_1.8.6                  readxl_1.3.1                TCGAbiolinks_2.18.0        

loaded via a namespace (and not attached):
  [1] backports_1.2.1             BiocFileCache_1.14.0        splines_4.0.3               BiocParallel_1.24.1         digest_0.6.27               htmltools_0.5.1.1          
  [7] fansi_0.4.2                 magrittr_2.0.1              memoise_2.0.0               cluster_2.1.1               remotes_2.2.0               openxlsx_4.2.3             
 [13] readr_1.4.0                 annotate_1.68.0             R.utils_2.10.1              askpass_1.1                 prettyunits_1.1.1           colorspace_2.0-0           
 [19] blob_1.2.1                  rvest_1.0.0                 rappdirs_0.3.3              ggrepel_0.9.1               haven_2.3.1                 xfun_0.22                  
 [25] crayon_1.4.1                RCurl_1.98-1.3              jsonlite_1.7.2              genefilter_1.72.1           zoo_1.8-9                   glue_1.4.2                 
 [31] gtable_0.3.0                zlibbioc_1.36.0             XVector_0.30.0              DelayedArray_0.16.2         car_3.0-10                  abind_1.4-5                
 [37] scales_1.1.1                futile.options_1.0.1        DBI_1.1.1                   rstatix_0.7.0               Rcpp_1.0.6                  xtable_1.8-4               
 [43] progress_1.2.2              flashClust_1.01-2           foreign_0.8-81              bit_4.0.4                   km.ci_0.5-2                 DT_0.17                    
 [49] htmlwidgets_1.5.3           httr_1.4.2                  RColorBrewer_1.1-2          ellipsis_0.3.1              pkgconfig_2.0.3             XML_3.99-0.6               
 [55] R.methodsS3_1.8.1           dbplyr_2.1.0                locfit_1.5-9.4              utf8_1.2.1                  tidyselect_1.1.0            rlang_0.4.10               
 [61] AnnotationDbi_1.52.0        munsell_0.5.0               cellranger_1.1.0            tools_4.0.3                 cachem_1.0.4                downloader_0.4             
 [67] generics_0.1.0              RSQLite_2.2.4               broom_0.7.5                 fastmap_1.1.0               knitr_1.31                  bit64_4.0.5                
 [73] zip_2.1.1                   survMisc_0.5.5              purrr_0.3.4                 formatR_1.8                 R.oo_1.24.0                 leaps_3.1                  
 [79] xml2_1.3.2                  biomaRt_2.46.3              compiler_4.0.3              curl_4.3                    ggsignif_0.6.1              tibble_3.1.0               
 [85] geneplotter_1.68.0          stringi_1.5.3               TCGAbiolinksGUI.data_1.10.0 forcats_0.5.1               lattice_0.20-41             Matrix_1.3-2               
 [91] KMsurv_0.1-5                vctrs_0.3.6                 pillar_1.5.1                lifecycle_1.0.0             BiocManager_1.30.10         bitops_1.0-6               
 [97] R6_2.5.0                    gridExtra_2.3               rio_0.5.26                  lambda.r_1.2.4              assertthat_0.2.1            openssl_1.4.3              
[103] withr_2.4.1                 GenomeInfoDbData_1.2.4      hms_1.0.0                   tidyr_1.1.3                 carData_3.0-4               scatterplot3d_0.3-41
@kevin
Ireland, Republic of

This works for me, as does the other command from the previous question. Could you please restart your R session, your router, and/or your computer, and then try again?

I am using the same version of R as you, but on Linux:

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.7 LTS

Matrix products: default
BLAS:   /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

I am using slightly older versions of the key packages, though:

other attached packages:
 [1] SummarizedExperiment_1.18.2 DelayedArray_0.14.1        
 [3] matrixStats_0.57.0          Biobase_2.48.0             
 [5] GenomicRanges_1.40.0        GenomeInfoDb_1.24.2        
 [7] IRanges_2.22.2              S4Vectors_0.26.1           
 [9] BiocGenerics_0.34.0         TCGAbiolinks_2.16.4

Kevin


I finally fixed it. The cause was a very poor internet connection.

Thank you so much!
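For anyone else who hits the same "Timeout of 60 seconds was reached" warning on a slow connection: 60 seconds is R's default value for the `timeout` option, so raising it for the duration of the download may help. The helper below is only a sketch; `with_longer_timeout` is a hypothetical name, not part of TCGAbiolinks, and I am assuming the download inside TCGAquery_recount2 honours `options(timeout = ...)`.

```r
# Hypothetical helper: raise R's internet timeout (default 60 s) while
# evaluating one expression, then restore the previous setting.
with_longer_timeout <- function(expr, seconds = 600) {
  old <- options(timeout = seconds)   # returns the previous value
  on.exit(options(old), add = TRUE)   # restore it even if expr fails
  force(expr)                         # expr is evaluated lazily, i.e. here
}

# Quick check that the option is changed only inside the call.
inside <- with_longer_timeout(getOption("timeout"), seconds = 600)
inside

# Usage with the call from this thread (needs TCGAbiolinks and network access):
# Lung_recount2 <- with_longer_timeout(
#   TCGAquery_recount2(project = "tcga", tissue = "lung")
# )
```

Setting `options(timeout = 600)` once at the top of the session does the same thing without the wrapper; the wrapper just avoids leaving the longer timeout in place afterwards.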


Kevin Blighe, I have another question that is not related to this error.

I'd like to use the TCGA lung data. After downloading, do you know how to tell the LUAD samples apart from the LUSC samples? I looked at the colData with colData(Lung_TCGA_recount2$tcga_lung)$project, but every value is TCGA. Thanks a lot!

Lung_TCGA_recount2 <- TCGAquery_recount2(project = "tcga", tissue = "lung")

Lung_TCGA_recount2
$tcga_lung
class: RangedSummarizedExperiment 
dim: 58037 1156 
metadata(0):
assays(1): counts
rownames(58037): ENSG00000000003.14 ENSG00000000005.5 ... ENSG00000283698.1 ENSG00000283699.1
rowData names(3): gene_id bp_length symbol
colnames(1156): 191FE3D1-FEBF-4585-B6F4-263BFAD4DD7E 672AFEB7-E9AA-4A44-AA9C-EF6344AE5C5C ... 244FC1FD-8C72-4581-8BF4-5AA90335024C
  7BECACF1-A35F-427E-8954-073A5518D905
colData names(864): project sample ... xml_primary_pathology_history_myasthenia_gravis xml_primary_pathology_section_myasthenia_gravis

Hey, it seems to be in the following column:

Lung_TCGA_recount2 <- TCGAquery_recount2(project = 'tcga', tissue = 'lung')
table(colData(Lung_TCGA_recount2$tcga_lung)$gdc_cases.project.name)

         Lung Adenocarcinoma Lung Squamous Cell Carcinoma 
                         601                          555 

Also take a look at the gdc_cases.project.project_id column.
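Once a column separates the two cohorts, the object can be split on it. Below is a minimal sketch, assuming the labels have one entry per sample; `split_columns_by` is a hypothetical helper, the toy matrix stands in for the real data, and the label values "TCGA-LUAD" and "TCGA-LUSC" are my assumption about what the project_id column contains.

```r
# Hypothetical helper: split any column-indexable object (matrix, data frame,
# SummarizedExperiment) into one piece per label. `labels` must have one
# entry per column of `x`.
split_columns_by <- function(x, labels) {
  stopifnot(length(labels) == ncol(x))
  idx <- split(seq_along(labels), labels)   # column indices per label
  lapply(idx, function(j) x[, j, drop = FALSE])
}

# Toy stand-in for the count matrix: 3 genes x 4 samples.
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))
# Assumed label values, one per sample.
project_id <- c("TCGA-LUAD", "TCGA-LUSC", "TCGA-LUAD", "TCGA-LUSC")

by_project <- split_columns_by(counts, project_id)
sapply(by_project, ncol)

# With the object from this thread (needs SummarizedExperiment loaded):
# rse <- Lung_TCGA_recount2$tcga_lung
# by_project <- split_columns_by(rse, colData(rse)$gdc_cases.project.project_id)
```

Because the pieces are named after whatever labels the column holds, nothing needs to be hard-coded; `names(by_project)` shows the labels actually present.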


Thanks a lot!


Kevin Blighe, sorry, one more question: which column holds the TCGA sample barcodes, e.g. "TCGA-B2-3924-01B-03R-A277-07"? How can I get the information in this format?

So far, "gdc_cases.samples.portions.analytes.aliquots.submitter_id" (with values like TCGA-62-A471-01A-12R..) is the most informative column I can find, but I still want the full name or barcode as used in TCGA-LUAD.

Thanks a lot!


I am not sure about that. Perhaps TCGAutils has a way to do it. Otherwise, you could retrieve the biotab metadata from the GDC (for LUAD), import it into R, and use that.
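One way to check whether a column already holds full aliquot barcodes is to validate and split its values. A full barcode such as the one quoted above has seven dash-separated fields (project, tissue source site, participant, sample type plus vial, portion plus analyte, plate, center) and 28 characters. The sketch below uses base R only; `parse_tcga_barcode` is a hypothetical helper, and whether the aliquot submitter_id column in the recount2 colData really contains untruncated barcodes is something I have not verified.

```r
# Hypothetical helper: split full TCGA aliquot barcodes into their fields.
# Stops if any value does not have the seven dash-separated fields.
parse_tcga_barcode <- function(barcode) {
  parts <- strsplit(barcode, "-", fixed = TRUE)
  if (!all(lengths(parts) == 7)) {
    stop("every barcode must have 7 dash-separated fields")
  }
  m <- do.call(rbind, parts)   # one row per barcode, one column per field
  data.frame(
    barcode     = barcode,
    patient     = paste(m[, 1], m[, 2], m[, 3], sep = "-"),
    tss         = m[, 2],                 # tissue source site
    participant = m[, 3],
    sample_type = substr(m[, 4], 1, 2),   # e.g. "01"
    vial        = substr(m[, 4], 3, 3),
    portion     = substr(m[, 5], 1, 2),
    analyte     = substr(m[, 5], 3, 3),   # e.g. "R" for RNA
    plate       = m[, 6],
    center      = m[, 7],
    stringsAsFactors = FALSE
  )
}

# The example barcode from the question.
parsed <- parse_tcga_barcode("TCGA-B2-3924-01B-03R-A277-07")
parsed

# With the object from this thread (needs SummarizedExperiment loaded):
# ids <- colData(Lung_TCGA_recount2$tcga_lung)$gdc_cases.samples.portions.analytes.aliquots.submitter_id
# table(nchar(ids))   # all 28 would suggest full aliquot barcodes
```

If the values turn out to be shorter identifiers, the TCGAutils or GDC biotab routes suggested above are still the way to go.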
