Question: Issue with TCGAbiolinks package.
10 weeks ago by
salvocomplicazioni10 wrote:

I'm trying to download a copy number variation file with the TCGAbiolinks package, using the following code:


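library(TCGAbiolinks)
Tumor <- c("BLCA")
# Build the GDC project ID ("TCGA-BLCA") and query the gene-level copy number scores
query <- GDCquery(project = paste("TCGA-", Tumor, sep = ""),
                  data.category = "Copy Number Variation",
                  data.type = "Gene Level Copy Number Scores",
                  access = "open")
# Download the files, then read them back from the same directory
GDCdownload(query, directory = "/tank/home/SIG/")
data <- GDCprepare(query, directory = "/tank/home/SIG/")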

But when I run GDCprepare I get an error:

Reading GISTIC file
Parsed with column specification:
cols(
  .default = col_double(),
  `Gene Symbol` = col_character(),
  Cytoband = col_character()
)
See spec(...) for full column specifications.
Error in stri_split_regex(string, pattern, n = n, simplify = simplify,  : 
  object 'res' not found
modified 10 weeks ago • written 10 weeks ago by salvocomplicazioni10

Can you update your post to include the output of sessionInfo(), so we can see which versions of R and TCGAbiolinks you're using? With my setup I get an error earlier in your example code:

> library(TCGAbiolinks)
> Tumor <- c("BLCA")
> query <- GDCquery(project = paste("TCGA-",Tumor, sep = ""),
+                   data.category = "Copy Number Variation",
+                   data.type = "Gene Level Copy Number Scores",              
+                   access="open")
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38


|sort(harmonized.data.type)        |
|:---------------------------------|
|Biospecimen Supplement            |
|Clinical Supplement               |
|Copy Number Segment               |
|Gene Expression Quantification    |
|Isoform Expression Quantification |
|Masked Copy Number Segment        |
|Masked Somatic Mutation           |
|miRNA Expression Quantification   |
Error in checkDataTypeInput(legacy = legacy, data.type = data.type) : 
  Please set a data.type argument from the column harmonized.data.type above
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /g/easybuild/x86_64/CentOS/7/haswell/software/OpenBLAS/0.2.20-GCC-6.4.0-2.28/lib/libopenblas_haswellp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.8.3

loaded via a namespace (and not attached):
  [1] colorspace_1.3-2            selectr_0.4-1               rjson_0.2.20               
  [4] hwriter_1.3.2               circlize_0.4.4              XVector_0.22.0             
  [7] GenomicRanges_1.34.0        GlobalOptions_0.1.0         rstudioapi_0.9.0           
... 
written 10 weeks ago by Mike Smith3.4k
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=it_IT.UTF-8       LC_NUMERIC=C               LC_TIME=it_IT.UTF-8        LC_COLLATE=it_IT.UTF-8     LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=it_IT.UTF-8   
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] reshape2_1.4.3                         BSgenome.Hsapiens.NCBI.GRCh38_1.3.1000 DT_0.5                                 TCGAbiolinks_2.10.2                   
 [5] BSgenome.Mmusculus.UCSC.mm9_1.4.0      BSgenome_1.50.0                        rtracklayer_1.42.1                     Biostrings_2.50.2                     
 [9] XVector_0.22.0                         deconstructSigs_1.8.0                  reprex_0.2.1                           calibrate_1.7.2                       
[13] MASS_7.3-51.1                          RColorBrewer_1.1-2                     bindrcpp_0.2.2                         pathview_1.22.1                       
[17] org.Hs.eg.db_3.7.0                     AnnotationDbi_1.44.0                   shinyBS_0.61                           shinydashboard_0.7.1                  
[21] shinyjs_1.0                            jsonlite_1.6                           shiny_1.2.0                            plotly_4.8.0                          
[25] debrowser_1.10.9                       biomaRt_2.38.0                         CEMiTool_1.6.10                        forcats_0.3.0                         
[29] stringr_1.3.1                          dplyr_0.7.8                            purrr_0.3.0                            readr_1.3.1                           
[33] tidyr_0.8.2                            tibble_2.0.1                           tidyverse_1.2.1                        genefilter_1.64.0                     
[37] hexbin_1.27.2                          splitstackshape_1.4.6                  vsn_3.50.0                             gplots_3.0.1.1                        
[41] ggplot2_3.1.0                          magrittr_1.5                           DESeq2_1.22.2                          SummarizedExperiment_1.12.0           
[45] DelayedArray_0.8.0                     BiocParallel_1.16.5                    matrixStats_0.54.0                     Biobase_2.42.0                        
[49] GenomicRanges_1.34.0                   GenomeInfoDb_1.18.1                    IRanges_2.16.0                         S4Vectors_0.20.1                      
[53] BiocGenerics_0.28.0                    R.utils_2.7.0                          R.oo_1.22.0                            R.methodsS3_1.7.1                     

loaded via a namespace (and not attached):
  [1] statnet.common_4.2.0        Hmisc_4.2-0                 class_7.3-15                Rsamtools_1.34.1            foreach_1.4.4              
  [6] crayon_1.3.4                nlme_3.1-137                backports_1.1.3             sva_3.30.1                  colourpicker_1.0           
 [11] impute_1.56.0               GOSemSim_2.8.0              rlang_0.3.1                 readxl_1.2.0                intergraph_2.0-2           
 [16] limma_3.38.3                rjson_0.2.20                cmprsk_2.2-7                bit64_0.9-7                 glue_1.3.0                 
 [21] trimcluster_0.1-2.1         UpSetR_1.3.3                DOSE_3.8.2                  haven_2.0.0                 tidyselect_0.2.5           
 [26] km.ci_0.5-2                 XML_3.98-1.16               zoo_1.8-4                   ggpubr_0.2                  GenomicAlignments_1.18.1   
 [31] org.Mm.eg.db_3.7.0          xtable_1.8-3                evaluate_0.12               cli_1.0.1                   zlibbioc_1.28.0            
 [36] hwriter_1.3.2               rstudioapi_0.9.0            miniUI_0.1.1.1              whisker_0.3-2               gRbase_1.8-3.4             
 [41] rpart_4.1-13                fastmatch_1.1-0             xfun_0.4                    cluster_2.0.7-1               BiocManager_1.30.4            

modified 10 weeks ago • written 10 weeks ago by salvocomplicazioni10

I found this in the code that calls the readGISTIC function:

else if (grepl("Copy Number Variation", query$data.category, 
    ignore.case = TRUE)) {
    if (query$data.type == "Gene Level Copy Number Scores") {
      data <- readGISTIC(files, res$results[[1]]$cases)
    }
    else {
      data <- readCopyNumberVariation(files, query$results[[1]]$cases)
    }
  }

But the following object doesn't seem to exist:

res$results[[1]]$cases

If I change it to the following:

query$results[[1]]$cases

the code works fine. Could this be a bug in the package?
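
To illustrate why swapping res for query makes a difference, here is a minimal sketch in base R. It does not use TCGAbiolinks at all: read_cases, prepare_buggy, prepare_fixed and the toy query list are hypothetical names made up for this example, and it assumes a fresh R session in which no object called res exists.

```r
# Hypothetical stand-in for the real reader: it just returns the case IDs it receives.
read_cases <- function(files, cases) cases

# Mirrors the branch quoted above: the function only receives `query`,
# so `res` must come from somewhere else (in a fresh session it does not exist).
prepare_buggy <- function(query) read_cases(NULL, res$results[[1]]$cases)

# Same call, using the argument that is actually in scope.
prepare_fixed <- function(query) read_cases(NULL, query$results[[1]]$cases)

# Toy query object with the same nesting as the snippet (results -> first element -> cases).
query <- list(results = list(data.frame(cases = c("case-1", "case-2"),
                                        stringsAsFactors = FALSE)))

# Capture the error instead of stopping; the message text depends on the session locale.
buggy_result <- tryCatch(prepare_buggy(query), error = function(e) e)
fixed_result <- prepare_fixed(query)

print(conditionMessage(buggy_result))
print(fixed_result)
```

Because R evaluates function arguments lazily, the missing object is only noticed when the argument is first used inside the called function. That may be why the error in the original post is reported from stri_split_regex rather than from the line that mentions res.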

written 10 weeks ago by salvocomplicazioni10
Answer: C: Issue with TCGAbiolinks package.
10 weeks ago by
Mike Smith3.4k
EMBL Heidelberg / de.NBI
Mike Smith3.4k wrote:

It looks like this was a bug that was fixed a few weeks ago (https://github.com/BioinformaticsFMRP/TCGAbiolinks/commit/8c5449958ad2e1cd300daa033d57d1267e7c9d6d).

I would run BiocManager::install('TCGAbiolinks') to get version 2.10.3 and try again from there.
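
A small sketch for checking whether an update is needed before reinstalling. The helper name needs_update is made up for this example, and 2.10.3 is taken from this thread as the first version containing the fix. The check only prints a hint and does not install anything itself.

```r
# Version that, according to this thread, contains the fix for the `res` bug.
fixed_version <- package_version("2.10.3")

# TRUE when the given version string is older than the fixed release.
needs_update <- function(installed, fixed = fixed_version) {
  package_version(installed) < fixed
}

# Only inspect the installed version if TCGAbiolinks is available at all.
if (requireNamespace("TCGAbiolinks", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("TCGAbiolinks"))
  if (needs_update(installed)) {
    message("TCGAbiolinks ", installed, " predates the fix; ",
            "run BiocManager::install('TCGAbiolinks') to update.")
  } else {
    message("TCGAbiolinks ", installed, " should already include the fix.")
  }
}
```

For the version in the sessionInfo() output above, needs_update("2.10.2") returns TRUE, so an update would be suggested.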

written 10 weeks ago by Mike Smith3.4k

Yes, TCGAbiolinks version 2.10.3 fixes the issue.

written 10 weeks ago by salvocomplicazioni10