Issue with TCGAbiolinks package.
@salvocomplicazioni1-14125
Last seen 11 months ago
Germany

I'm trying to download a copy number variation file with the TCGAbiolinks package, using the following code:


library(TCGAbiolinks)
Tumor <- c("BLCA")
query <- GDCquery(project = paste("TCGA-",Tumor, sep = ""),
                  data.category = "Copy Number Variation",
                  data.type = "Gene Level Copy Number Scores",              
                  access="open")
GDCdownload(query, directory = "/tank/home/SIG/")
data <- GDCprepare(query, directory = "/tank/home/SIG/")

But when I run GDCprepare, I get an error:

Reading GISTIC file
Parsed with column specification:
cols(
  .default = col_double(),
  `Gene Symbol` = col_character(),
  Cytoband = col_character()
)
See spec(...) for full column specifications.
Error in stri_split_regex(string, pattern, n = n, simplify = simplify,  : 
  object 'res' not found
TCGA GDCprepare • 1.6k views

Can you update your post to include the output of sessionInfo(), so we can see which versions of R and TCGAbiolinks you're using? With my setup, I get an error earlier in your example code:

> library(TCGAbiolinks)
> Tumor <- c("BLCA")
> query <- GDCquery(project = paste("TCGA-",Tumor, sep = ""),
+                   data.category = "Copy Number Variation",
+                   data.type = "Gene Level Copy Number Scores",              
+                   access="open")
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38


|sort(harmonized.data.type)        |
|:---------------------------------|
|Biospecimen Supplement            |
|Clinical Supplement               |
|Copy Number Segment               |
|Gene Expression Quantification    |
|Isoform Expression Quantification |
|Masked Copy Number Segment        |
|Masked Somatic Mutation           |
|miRNA Expression Quantification   |
Error in checkDataTypeInput(legacy = legacy, data.type = data.type) : 
  Please set a data.type argument from the column harmonized.data.type above
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /g/easybuild/x86_64/CentOS/7/haswell/software/OpenBLAS/0.2.20-GCC-6.4.0-2.28/lib/libopenblas_haswellp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.8.3

loaded via a namespace (and not attached):
  [1] colorspace_1.3-2            selectr_0.4-1               rjson_0.2.20               
  [4] hwriter_1.3.2               circlize_0.4.4              XVector_0.22.0             
  [7] GenomicRanges_1.34.0        GlobalOptions_0.1.0         rstudioapi_0.9.0           
... 
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=it_IT.UTF-8       LC_NUMERIC=C               LC_TIME=it_IT.UTF-8        LC_COLLATE=it_IT.UTF-8     LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=it_IT.UTF-8   
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.4.3                         BSgenome.Hsapiens.NCBI.GRCh38_1.3.1000 DT_0.5                                 TCGAbiolinks_2.10.2                   
 [5] BSgenome.Mmusculus.UCSC.mm9_1.4.0      BSgenome_1.50.0                        rtracklayer_1.42.1                     Biostrings_2.50.2                     
 [9] XVector_0.22.0                         deconstructSigs_1.8.0                  reprex_0.2.1                           calibrate_1.7.2                       
[13] MASS_7.3-51.1                          RColorBrewer_1.1-2                     bindrcpp_0.2.2                         pathview_1.22.1                       
[17] org.Hs.eg.db_3.7.0                     AnnotationDbi_1.44.0                   shinyBS_0.61                           shinydashboard_0.7.1                  
[21] shinyjs_1.0                            jsonlite_1.6                           shiny_1.2.0                            plotly_4.8.0                          
[25] debrowser_1.10.9                       biomaRt_2.38.0                         CEMiTool_1.6.10                        forcats_0.3.0                         
[29] stringr_1.3.1                          dplyr_0.7.8                            purrr_0.3.0                            readr_1.3.1                           
[33] tidyr_0.8.2                            tibble_2.0.1                           tidyverse_1.2.1                        genefilter_1.64.0                     
[37] hexbin_1.27.2                          splitstackshape_1.4.6                  vsn_3.50.0                             gplots_3.0.1.1                        
[41] ggplot2_3.1.0                          magrittr_1.5                           DESeq2_1.22.2                          SummarizedExperiment_1.12.0           
[45] DelayedArray_0.8.0                     BiocParallel_1.16.5                    matrixStats_0.54.0                     Biobase_2.42.0                        
[49] GenomicRanges_1.34.0                   GenomeInfoDb_1.18.1                    IRanges_2.16.0                         S4Vectors_0.20.1                      
[53] BiocGenerics_0.28.0                    R.utils_2.7.0                          R.oo_1.22.0                            R.methodsS3_1.7.1                     

loaded via a namespace (and not attached):
  [1] statnet.common_4.2.0        Hmisc_4.2-0                 class_7.3-15                Rsamtools_1.34.1            foreach_1.4.4              
  [6] crayon_1.3.4                nlme_3.1-137                backports_1.1.3             sva_3.30.1                  colourpicker_1.0           
 [11] impute_1.56.0               GOSemSim_2.8.0              rlang_0.3.1                 readxl_1.2.0                intergraph_2.0-2           
 [16] limma_3.38.3                rjson_0.2.20                cmprsk_2.2-7                bit64_0.9-7                 glue_1.3.0                 
 [21] trimcluster_0.1-2.1         UpSetR_1.3.3                DOSE_3.8.2                  haven_2.0.0                 tidyselect_0.2.5           
 [26] km.ci_0.5-2                 XML_3.98-1.16               zoo_1.8-4                   ggpubr_0.2                  GenomicAlignments_1.18.1   
 [31] org.Mm.eg.db_3.7.0          xtable_1.8-3                evaluate_0.12               cli_1.0.1                   zlibbioc_1.28.0            
 [36] hwriter_1.3.2               rstudioapi_0.9.0            miniUI_0.1.1.1              whisker_0.3-2               gRbase_1.8-3.4             
 [41] rpart_4.1-13                fastmatch_1.1-0             xfun_0.4                    cluster_2.0.7-1               BiocManager_1.30.4            


I found this in the code that calls the readGISTIC function:

else if (grepl("Copy Number Variation", query$data.category, 
    ignore.case = TRUE)) {
    if (query$data.type == "Gene Level Copy Number Scores") {
      data <- readGISTIC(files, res$results[[1]]$cases)
    }
    else {
      data <- readCopyNumberVariation(files, query$results[[1]]$cases)
    }
  }

but the following object doesn't seem to exist:

res$results[[1]]$cases

If I change it to the following:

query$results[[1]]$cases

the code works fine. Is it possible that this is a bug in the package?
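
The difference between the two expressions can be reproduced in base R, without TCGAbiolinks. This is only a minimal sketch: the list `query` below is a hypothetical stand-in that mimics the shape of a GDCquery result, and `prepare_buggy` and `prepare_fixed` are made-up helper names, not functions from the package.

```r
# Hypothetical stand-in for the object returned by GDCquery():
# a list whose `results` element holds a data frame with a `cases` column.
query <- list(
  results = list(
    data.frame(cases = c("case-A", "case-B"), stringsAsFactors = FALSE)
  )
)

# Mimics the failing branch: `res` is never defined in this scope,
# so evaluating it raises an "object not found" error.
prepare_buggy <- function(query) {
  res$results[[1]]$cases
}

# Mimics the corrected branch: reads the cases from the `query` argument.
prepare_fixed <- function(query) {
  query$results[[1]]$cases
}

# Capture the error message instead of stopping the script.
# The exact wording of the message depends on the session locale.
buggy <- tryCatch(prepare_buggy(query), error = function(e) conditionMessage(e))
fixed <- prepare_fixed(query)

print(buggy)
print(fixed)
```

Because R evaluates function arguments lazily, an undefined name like this may only fail deep inside another call, which would explain why the original error is reported from stri_split_regex rather than from the line that mentions `res`.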

Mike Smith ★ 6.5k
@mike-smith
Last seen 6 hours ago
EMBL Heidelberg

It looks like this was a bug that was fixed a few weeks ago (https://github.com/BioinformaticsFMRP/TCGAbiolinks/commit/8c5449958ad2e1cd300daa033d57d1267e7c9d6d).

I would do a BiocManager::install('TCGAbiolinks') to get version 2.10.3 and try from there.
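
To check whether an installation already includes the fix, the version numbers can be compared in base R. This is a rough sketch, assuming 2.10.3 is the first release that contains the fix; the variable names are made up for illustration.

```r
# Version assumed to contain the fix, and the version reported
# in the sessionInfo() output posted in this thread.
fixed_in <- package_version("2.10.3")
reported <- package_version("2.10.2")

# TRUE when the reported version predates the fix.
needs_update <- reported < fixed_in

# If TCGAbiolinks is installed locally, run the same check on the real version.
if (requireNamespace("TCGAbiolinks", quietly = TRUE)) {
  installed <- utils::packageVersion("TCGAbiolinks")
  status <- if (installed < fixed_in) "update needed" else "at or above 2.10.3"
  message("Installed TCGAbiolinks ", as.character(installed), ": ", status)
}
```

After updating, restart R before re-running GDCprepare so that the new version of the package is the one that gets loaded.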


Yes, TCGAbiolinks version 2.10.3 fixes the issue.
