Error using TCGAbiolinks "Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column"
f.geist ▴ 20
@fgeist-11258
Last seen 6.7 years ago

Hi everybody,

When I try to use TCGAbiolinks, I get an error while preparing the data. It would be awesome if somebody had an answer to my problem, thank you!

I have no problem downloading the data via GDCdownload, but as soon as I try to prepare the data (GDCprepare) to get a SummarizedExperiment, I get this error:

Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column
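
For context, the message itself comes from base R's merge(), which stops with it when a column named in by, by.x or by.y does not exist in the corresponding table. The sketch below reproduces it with two made-up data frames (the names expression and annotation and their columns are purely illustrative, not anything taken from TCGAbiolinks). It is an assumption, not something confirmed here, that a similar join on a missing column happens inside GDCprepare.

```r
# Two toy tables; 'annotation' deliberately lacks the key column 'gene_id'.
expression <- data.frame(gene_id = c("A", "B"), value = c(1.5, 2.5))
annotation <- data.frame(symbol = c("A", "B"), chr = c("1", "2"))

# merge() fails because by.y names a column that 'annotation' does not have.
# tryCatch() captures the error message instead of aborting the script.
result <- tryCatch(
  merge(expression, annotation, by.x = "gene_id", by.y = "gene_id"),
  error = function(e) conditionMessage(e)
)
print(result)
```

Pointing by.y at a column that actually exists (here "symbol") lets the same merge() call go through.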

Here is my code that I want to use to download and prepare the expression data of the TCGA KIRC project:

library(TCGAbiolinks)
query <- GDCquery(project = "TCGA-KIRC",
                  legacy = TRUE,
                  data.category = "Gene expression",
                  data.type = "Gene expression quantification",
                  sample.type = "Primary solid Tumor",
                  file.type =  "normalized_results")

GDCdownload(query, method = "api")

data <- GDCprepare(query,
                   save = TRUE,
                   save.filename = "Gene_Expression_Quantification.rda",
                   remove.files.prepared = TRUE)

 

Thank you so much!

Felix

> sessionInfo()
R version 3.4.0 Patched (2017-06-17 r72807)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.5.4

loaded via a namespace (and not attached):
  [1] circlize_0.4.0              fastmatch_1.1-0             aroma.light_3.6.0           plyr_1.8.4                 
  [5] igraph_1.0.1                selectr_0.3-1               ConsensusClusterPlus_1.40.0 lazyeval_0.2.0             
  [9] splines_3.4.0               BiocParallel_1.10.1         pathview_1.16.0             GenomeInfoDb_1.12.2        
 [13] ggplot2_2.2.1               digest_0.6.12               foreach_1.4.3               GOSemSim_2.2.0             
 [17] viridis_0.4.0               GO.db_3.4.1                 magrittr_1.5                memoise_1.1.0              
 [21] cluster_2.0.6               doParallel_1.0.10           limma_3.32.2                ComplexHeatmap_1.14.0      
 [25] Biostrings_2.44.1           readr_1.1.1                 annotate_1.54.0             matrixStats_0.52.2         
 [29] R.utils_2.5.0               colorspace_1.3-2            rvest_0.3.2                 ggrepel_0.6.5              
 [33] dplyr_0.7.0                 RCurl_1.95-4.8              jsonlite_1.5                hexbin_1.27.1              
 [37] graph_1.54.0                genefilter_1.58.1           supraHex_1.14.0             zoo_1.8-0                  
 [41] survival_2.41-3             iterators_1.0.8             ape_4.1                     glue_1.1.0                 
 [45] survminer_0.4.0             gtable_0.2.0                zlibbioc_1.22.0             XVector_0.16.0             
 [49] GetoptLong_0.1.6            DelayedArray_0.2.7          kernlab_0.9-25              Rgraphviz_2.20.0           
 [53] shape_1.4.2                 prabclus_2.2-6              BiocGenerics_0.22.0         DEoptimR_1.0-8             
 [57] scales_0.4.1                DOSE_3.2.0                  DESeq_1.28.0                mvtnorm_1.0-6              
 [61] DBI_0.7                     edgeR_3.18.1                ggthemes_3.4.0              Rcpp_0.12.11               
 [65] cmprsk_2.2-7                viridisLite_0.2.0           xtable_1.8-2                foreign_0.8-67             
 [69] matlab_1.0.2                mclust_5.3                  km.ci_0.5-2                 stats4_3.4.0               
 [73] httr_1.2.1                  fgsea_1.2.1                 RColorBrewer_1.1-2          fpc_2.1-10                 
 [77] modeltools_0.2-21           XML_3.98-1.8                R.methodsS3_1.7.1           flexmix_2.3-14             
 [81] nnet_7.3-12                 locfit_1.5-9.1              rlang_0.1.1                 reshape2_1.4.2             
 [85] AnnotationDbi_1.38.1        munsell_0.4.3               tools_3.4.0                 downloader_0.4             
 [89] RSQLite_1.1-2               broom_0.4.2                 stringr_1.2.0               knitr_1.16                 
 [93] robustbase_0.92-7           survMisc_0.5.4              purrr_0.2.2.2               KEGGREST_1.16.0            
 [97] dendextend_1.5.2            EDASeq_2.10.0               nlme_3.1-131                whisker_0.3-2              
[101] R.oo_1.21.0                 KEGGgraph_1.34.0            DO.db_2.9                   xml2_1.1.1                 
[105] biomaRt_2.32.1              compiler_3.4.0              curl_2.6                    png_0.1-7                  
[109] tibble_1.3.3                geneplotter_1.54.0          stringi_1.1.5               GenomicFeatures_1.28.3     
[113] lattice_0.20-35             trimcluster_0.1-2           Matrix_1.2-10               psych_1.7.5                
[117] KMsurv_0.1-5                GlobalOptions_0.0.12        data.table_1.10.4           bitops_1.0-6               
[121] rtracklayer_1.36.3          GenomicRanges_1.28.3        qvalue_2.8.0                R6_2.2.2                   
[125] latticeExtra_0.6-28         hwriter_1.3.2               ShortRead_1.34.0            gridExtra_2.2.1            
[129] IRanges_2.10.2              codetools_0.2-15            MASS_7.3-47                 assertthat_0.2.0           
[133] SummarizedExperiment_1.6.3  rjson_0.2.15                GenomicAlignments_1.12.1    Rsamtools_1.28.0           
[137] mnormt_1.5-5                S4Vectors_0.14.3            GenomeInfoDbData_0.99.0     diptest_0.75-7             
[141] parallel_3.4.0              hms_0.3                     clusterProfiler_3.4.3       grid_3.4.0                 
[145] tidyr_0.6.3                 class_7.3-14                rvcheck_0.0.8               ggpubr_0.1.3               
[149] Biobase_2.36.2       

 

TCGA tcgabiolinks Error • 3.5k views
f.geist ▴ 20
@fgeist-11258
Last seen 6.7 years ago

So, apparently the error lies in the SummarizedExperiment generation, as GDCprepare works if I use the following code:

df <- GDCprepare(query,
                 save=TRUE,
                 save.filename = "Gene_Expression_Quantification.rda",
                 summarizedExperiment = FALSE)

Nevertheless, I would like to get a SummarizedExperiment, since I want to combine the expression data easily with the clinical data.

Thank you so much for your help!
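
As a stopgap, the matching could in principle be done by hand in base R. This is only a rough sketch on made-up data: the matrix expr, the table clinical and its columns are hypothetical, and the real output of GDCprepare would first have to be reshaped into such a matrix. It relies on the first 12 characters of a TCGA sample barcode identifying the patient.

```r
# Toy expression matrix: genes in rows, TCGA sample barcodes in columns
# (made-up values and barcodes, for illustration only).
expr <- matrix(
  c(10, 20, 30, 40),
  nrow = 2,
  dimnames = list(
    c("GENE1", "GENE2"),
    c("TCGA-AA-0001-01A-11R-0000-07", "TCGA-AA-0002-01A-11R-0000-07")
  )
)

# Toy clinical table keyed by patient barcode (hypothetical columns).
clinical <- data.frame(
  bcr_patient_barcode = c("TCGA-AA-0002", "TCGA-AA-0001"),
  vital_status = c("Dead", "Alive"),
  stringsAsFactors = FALSE
)

# The first 12 characters of a sample barcode identify the patient.
patient <- substr(colnames(expr), 1, 12)

# Reorder the clinical rows so that they line up with the expression columns.
sample_info <- clinical[match(patient, clinical$bcr_patient_barcode), ]
rownames(sample_info) <- colnames(expr)
```

After this, row i of sample_info describes column i of expr, which is the pairing a SummarizedExperiment would otherwise keep in sync.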

Felix Geist

 

PhD student dkfz Heidelberg

 


Hi Felix,

I know it is an old issue that you reported but I am facing the same problem right now. Have you managed to sort it out?

Thanks,

Krzysztof
