clinicalData() is not returning all columns from cBioPortal
@32dfa1e8
Last seen 3.0 years ago
Austria

In the TCGA Breast Pan-Cancer data set there is a clinical annotation column "SUBTYPE" (which contains BRCA_LumA, BRCA_LumB, BRCA_Basal, BRCA_Her2, BRCA_Normal).

My little R script is failing to get this column. How can I get these data?


library(cBioPortalData)
cbio <- cBioPortal()
x <- clinicalData(cbio, "brca_tcga_pan_can_atlas_2018")
x[["SUBTYPE"]]
NULL

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.3 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /apps/prod/easybuild/sl7.x86_64/software/OpenBLAS/0.3.9-GCC-9.3.0/lib/libopenblasp-r0.3.9.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] cBioPortalData_2.0.10       MultiAssayExperiment_1.14.0 SummarizedExperiment_1.18.2 DelayedArray_0.14.1         matrixStats_0.57.0          Biobase_2.48.0              GenomicRanges_1.40.0        GenomeInfoDb_1.24.2         IRanges_2.22.2             
[10] S4Vectors_0.26.1            BiocGenerics_0.34.0         AnVIL_1.0.3                 dplyr_1.0.4                

loaded via a namespace (and not attached):
 [1] httr_1.4.2                bit64_4.0.5               jsonlite_1.7.1            splines_4.0.2             assertthat_0.2.1          askpass_1.1               TCGAutils_1.8.1           BiocFileCache_1.12.1      blob_1.2.1                Rsamtools_2.4.0          
[11] GenomeInfoDbData_1.2.3    RTCGAToolbox_2.18.0       progress_1.2.2            yaml_2.2.1                pillar_1.5.1              RSQLite_2.2.0             lattice_0.20-41           glue_1.4.2                limma_3.46.0              digest_0.6.25            
[21] XVector_0.28.0            rvest_0.3.6               Matrix_1.3-2              XML_3.99-0.5              pkgconfig_2.0.3           biomaRt_2.44.4            zlibbioc_1.34.0           purrr_0.3.4               RCircos_1.2.1             rapiclient_0.1.3         
[31] BiocParallel_1.22.0       openssl_1.4.2             tibble_3.1.0              generics_0.1.0            ellipsis_0.3.1            GenomicFeatures_1.40.1    survival_3.2-3            RJSONIO_1.3-1.4           magrittr_2.0.1            crayon_1.3.4             
[41] memoise_1.1.0             fansi_0.4.1               xml2_1.3.2                prettyunits_1.1.1         tools_4.0.2               data.table_1.14.0         hms_0.5.3                 formatR_1.7               lifecycle_0.2.0           stringr_1.4.0            
[51] Biostrings_2.56.0         AnnotationDbi_1.50.3      lambda.r_1.2.4            compiler_4.0.2            rlang_0.4.10              futile.logger_1.4.3       debugme_1.1.0             grid_4.0.2                GenomicDataCommons_1.12.0 RCurl_1.98-1.2           
[61] rstudioapi_0.11           rappdirs_0.3.1            bitops_1.0-6              DBI_1.1.1                 curl_4.3                  R6_2.4.1                  GenomicAlignments_1.24.0  rtracklayer_1.48.0        bit_4.0.4                 utf8_1.1.4               
[71] futile.options_1.0.1      readr_1.4.0               stringi_1.5.3             RaggedExperiment_1.12.0   Rcpp_1.0.5                vctrs_0.3.6               dbplyr_1.4.4              tidyselect_1.1.0

andreas.wernitznig,

There has been an update to this by Marcel Ramos, the same gentleman who replied below. Try again with cBioPortalData in Bioc-devel (package version 2.13.4). I've incorporated the SAMPLE_ID information from the datasets to map samples and build the SummarizedExperiment objects. You should now get an object that looks like the following:

> (mae <- cBioDataPack("brain_cptac_2020"))
A MultiAssayExperiment object of 7 listed
 experiments with user-defined names and respective classes.
 Containing an ExperimentList class object of length 7:
 [1] cna: SummarizedExperiment with 19380 rows and 190 columns
 [2] linear_cna: SummarizedExperiment with 19380 rows and 190 columns
 [3] mrna_seq_v2_rsem_zscores_ref_all_samples: SummarizedExperiment with 18209 rows and 188 columns
 [4] mrna_seq_v2_rsem: SummarizedExperiment with 18209 rows and 188 columns
 [5] mutations: RaggedExperiment with 9951 rows and 200 columns
 [6] protein_quantification_zscores: SummarizedExperiment with 6429 rows and 218 columns
 [7] protein_quantification: SummarizedExperiment with 6429 rows and 218 columns
Functionality:
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files

These changes are in the latest version of cBioPortalData in Bioc-devel (package version 2.13.4).
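
To see which clinical columns actually came back, a small check along these lines may help. This is only a sketch: `find_clinical_columns` is a hypothetical helper (not part of cBioPortalData), and the data frame below is a made-up stand-in so the snippet runs without downloading anything. The same call should work on the tibble from `clinicalData()` or on `colData(mae)`, since both support `colnames()`.

```r
# Hypothetical helper: report which of the requested columns are present
# in a clinical table (tibble, data.frame, or DataFrame).
find_clinical_columns <- function(clin, wanted) {
    list(
        present = intersect(wanted, colnames(clin)),
        missing = setdiff(wanted, colnames(clin))
    )
}

# Made-up stand-in for a clinical table; replace with
#   clinicalData(cbio, "brca_tcga_pan_can_atlas_2018")  or  colData(mae)
clin <- data.frame(
    patientId = c("P1", "P2"),
    SUBTYPE   = c("BRCA_LumA", "BRCA_Basal")
)

res <- find_clinical_columns(clin, c("SUBTYPE", "SAMPLE_ID"))
print(res)
```

If the column you want is listed under `missing`, the problem is in what was downloaded or cached rather than in how you index the object.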

@marcel-ramos-7325
Last seen 17 days ago
United States

Hi Andreas,

Please make sure that you have the latest release version of cBioPortalData (2.2.11) and that BiocManager::valid() returns TRUE. You can also remove the cached data so that it is downloaded again:

removeDataCache(api = cbio,
    studyId = "brca_tcga_pan_can_atlas_2018",
    dry.run = FALSE
)
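
A quick way to check both conditions before re-running could look like the sketch below. It assumes 2.2.11 is the minimum version you want (taken from this reply; adjust it for your Bioconductor release), and it only calls `BiocManager::valid()` in an interactive session because that call needs network access.

```r
# Minimum version quoted in this reply; adjust for your Bioconductor release.
min_version <- "2.2.11"

# TRUE/FALSE if cBioPortalData is installed, NA if it is not.
version_ok <- if (nzchar(system.file(package = "cBioPortalData"))) {
    utils::packageVersion("cBioPortalData") >= min_version
} else {
    NA
}
print(version_ok)

# BiocManager::valid() returns TRUE when the installed packages are consistent
# with the Bioconductor version in use; otherwise it returns a report of
# out-of-date or too-new packages, which is printed here.
if (interactive() && requireNamespace("BiocManager", quietly = TRUE)) {
    status <- BiocManager::valid()
    if (!isTRUE(status)) print(status)
}
```

Using `isTRUE()` rather than `== TRUE` avoids an error when `valid()` returns a report object instead of a logical.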

<details> <summary> reprex here </summary>

suppressPackageStartupMessages(
    library(cBioPortalData)
)
cbio <- cBioPortal()
x <- clinicalData(cbio, "brca_tcga_pan_can_atlas_2018")
table(x[["SUBTYPE"]])
#> 
#>  BRCA_Basal   BRCA_Her2   BRCA_LumA   BRCA_LumB BRCA_Normal 
#>         171          78         499         197          36
sessionInfo()
#> R version 4.0.5 Patched (2021-03-31 r80179)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.10
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#>  [1] cBioPortalData_2.2.11       MultiAssayExperiment_1.16.0
#>  [3] SummarizedExperiment_1.20.0 Biobase_2.50.0             
#>  [5] GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
#>  [7] IRanges_2.24.1              S4Vectors_0.28.1           
#>  [9] BiocGenerics_0.36.0         MatrixGenerics_1.2.1       
#> [11] matrixStats_0.58.0          AnVIL_1.2.0                
#> [13] dplyr_1.0.5                
#> 
#> loaded via a namespace (and not attached):
#>  [1] bitops_1.0-6              fs_1.5.0                 
#>  [3] bit64_4.0.5               progress_1.2.2           
#>  [5] httr_1.4.2                GenomicDataCommons_1.14.0
#>  [7] tools_4.0.3               backports_1.2.1          
#>  [9] utf8_1.2.1                R6_2.5.0                 
#> [11] DBI_1.1.1                 withr_2.4.1              
#> [13] tidyselect_1.1.0          prettyunits_1.1.1        
#> [15] TCGAutils_1.10.0          bit_4.0.4                
#> [17] curl_4.3                  compiler_4.0.3           
#> [19] rvest_1.0.0               formatR_1.9              
#> [21] xml2_1.3.2                DelayedArray_0.16.3      
#> [23] rtracklayer_1.50.0        readr_1.4.0              
#> [25] askpass_1.1               rappdirs_0.3.3           
#> [27] rapiclient_0.1.3          RCircos_1.2.1            
#> [29] Rsamtools_2.6.0           stringr_1.4.0            
#> [31] digest_0.6.27             rmarkdown_2.7            
#> [33] XVector_0.30.0            pkgconfig_2.0.3          
#> [35] htmltools_0.5.1.1         styler_1.4.1             
#> [37] dbplyr_2.1.1              fastmap_1.1.0            
#> [39] limma_3.46.0              highr_0.8                
#> [41] rlang_0.4.10              RSQLite_2.2.6            
#> [43] generics_0.1.0            jsonlite_1.7.2           
#> [45] BiocParallel_1.24.1       RCurl_1.98-1.3           
#> [47] magrittr_2.0.1            GenomeInfoDbData_1.2.4   
#> [49] futile.logger_1.4.3       Matrix_1.3-2             
#> [51] Rcpp_1.0.6                fansi_0.4.2              
#> [53] lifecycle_1.0.0           stringi_1.5.3            
#> [55] yaml_2.2.1                RaggedExperiment_1.14.1  
#> [57] RJSONIO_1.3-1.4           zlibbioc_1.36.0          
#> [59] BiocFileCache_1.14.0      grid_4.0.3               
#> [61] blob_1.2.1                crayon_1.4.1             
#> [63] lattice_0.20-41           Biostrings_2.58.0        
#> [65] splines_4.0.3             GenomicFeatures_1.42.3   
#> [67] hms_1.0.0                 knitr_1.32               
#> [69] pillar_1.6.0              biomaRt_2.46.3           
#> [71] futile.options_1.0.1      reprex_2.0.0             
#> [73] XML_3.99-0.6              glue_1.4.2               
#> [75] evaluate_0.14             lambda.r_1.2.4           
#> [77] data.table_1.14.0         vctrs_0.3.7              
#> [79] openssl_1.4.3             purrr_0.3.4              
#> [81] assertthat_0.2.1          cachem_1.0.4             
#> [83] xfun_0.22                 survival_3.2-10          
#> [85] tibble_3.1.0              RTCGAToolbox_2.20.0      
#> [87] GenomicAlignments_1.26.0  AnnotationDbi_1.52.0     
#> [89] memoise_2.0.0             ellipsis_0.3.1

Created on 2021-04-28 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)

</details>
Best,

Marcel
