Question: Error using TCGAbiolinks "Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column"
f.geist wrote (5 months ago):

Hi everybody,

When I try to use TCGAbiolinks, I get an error while preparing the data. It would be great if somebody had an answer to my problem, thank you!

Downloading the data via GDCdownload works without problems, but as soon as I try to prepare the data with GDCprepare to get a SummarizedExperiment, I get this error:

Downloading genome information (try:0) Using: Homo sapiens genes (GRCh37.p13)
Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column
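
For context, this message comes from base R's merge(): it is raised when the column named in `by.y` does not exist in the second data frame. The sketch below reproduces it with toy data frames. The column names are hypothetical, and whether GDCprepare fails for exactly this reason (for example, because the downloaded genome annotation lacks the expected column) is an assumption, not something the output above confirms.

```r
# Toy data frames; all column names here are made up for illustration
expr  <- data.frame(gene_id = c("A", "B"), value = c(1.5, 2.0))
annot <- data.frame(symbol = c("A", "B"), chr = c("1", "2"))

# by.y names a column that 'annot' does not have, so merge() stops in fix.by()
res <- tryCatch(
  merge(expr, annot, by.x = "gene_id", by.y = "gene_id"),
  error = function(e) conditionMessage(e)
)
print(res)

# Pointing by.y at a column that actually exists lets the merge succeed
ok <- merge(expr, annot, by.x = "gene_id", by.y = "symbol")
print(ok)
```

So a first check would be to look at the column names of both tables that go into the failing merge.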

Here is my code that I want to use to download and prepare the expression data of the TCGA KIRC project:

library(TCGAbiolinks)
query <- GDCquery(project = "TCGA-KIRC",
                  legacy = TRUE,
                  data.category = "Gene expression",
                  data.type = "Gene expression quantification",
                  sample.type = "Primary solid Tumor",
                  file.type =  "normalized_results")

GDCdownload(query, method = "api")

data <- GDCprepare(query, save = TRUE,
  save.filename = "Gene_Expression_Quantification.rda",
  remove.files.prepared = TRUE)

 

Thank you so much!

Felix

> sessionInfo()
R version 3.4.0 Patched (2017-06-17 r72807)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.5.4

loaded via a namespace (and not attached):
  [1] circlize_0.4.0              fastmatch_1.1-0             aroma.light_3.6.0           plyr_1.8.4                 
  [5] igraph_1.0.1                selectr_0.3-1               ConsensusClusterPlus_1.40.0 lazyeval_0.2.0             
  [9] splines_3.4.0               BiocParallel_1.10.1         pathview_1.16.0             GenomeInfoDb_1.12.2        
 [13] ggplot2_2.2.1               digest_0.6.12               foreach_1.4.3               GOSemSim_2.2.0             
 [17] viridis_0.4.0               GO.db_3.4.1                 magrittr_1.5                memoise_1.1.0              
 [21] cluster_2.0.6               doParallel_1.0.10           limma_3.32.2                ComplexHeatmap_1.14.0      
 [25] Biostrings_2.44.1           readr_1.1.1                 annotate_1.54.0             matrixStats_0.52.2         
 [29] R.utils_2.5.0               colorspace_1.3-2            rvest_0.3.2                 ggrepel_0.6.5              
 [33] dplyr_0.7.0                 RCurl_1.95-4.8              jsonlite_1.5                hexbin_1.27.1              
 [37] graph_1.54.0                genefilter_1.58.1           supraHex_1.14.0             zoo_1.8-0                  
 [41] survival_2.41-3             iterators_1.0.8             ape_4.1                     glue_1.1.0                 
 [45] survminer_0.4.0             gtable_0.2.0                zlibbioc_1.22.0             XVector_0.16.0             
 [49] GetoptLong_0.1.6            DelayedArray_0.2.7          kernlab_0.9-25              Rgraphviz_2.20.0           
 [53] shape_1.4.2                 prabclus_2.2-6              BiocGenerics_0.22.0         DEoptimR_1.0-8             
 [57] scales_0.4.1                DOSE_3.2.0                  DESeq_1.28.0                mvtnorm_1.0-6              
 [61] DBI_0.7                     edgeR_3.18.1                ggthemes_3.4.0              Rcpp_0.12.11               
 [65] cmprsk_2.2-7                viridisLite_0.2.0           xtable_1.8-2                foreign_0.8-67             
 [69] matlab_1.0.2                mclust_5.3                  km.ci_0.5-2                 stats4_3.4.0               
 [73] httr_1.2.1                  fgsea_1.2.1                 RColorBrewer_1.1-2          fpc_2.1-10                 
 [77] modeltools_0.2-21           XML_3.98-1.8                R.methodsS3_1.7.1           flexmix_2.3-14             
 [81] nnet_7.3-12                 locfit_1.5-9.1              rlang_0.1.1                 reshape2_1.4.2             
 [85] AnnotationDbi_1.38.1        munsell_0.4.3               tools_3.4.0                 downloader_0.4             
 [89] RSQLite_1.1-2               broom_0.4.2                 stringr_1.2.0               knitr_1.16                 
 [93] robustbase_0.92-7           survMisc_0.5.4              purrr_0.2.2.2               KEGGREST_1.16.0            
 [97] dendextend_1.5.2            EDASeq_2.10.0               nlme_3.1-131                whisker_0.3-2              
[101] R.oo_1.21.0                 KEGGgraph_1.34.0            DO.db_2.9                   xml2_1.1.1                 
[105] biomaRt_2.32.1              compiler_3.4.0              curl_2.6                    png_0.1-7                  
[109] tibble_1.3.3                geneplotter_1.54.0          stringi_1.1.5               GenomicFeatures_1.28.3     
[113] lattice_0.20-35             trimcluster_0.1-2           Matrix_1.2-10               psych_1.7.5                
[117] KMsurv_0.1-5                GlobalOptions_0.0.12        data.table_1.10.4           bitops_1.0-6               
[121] rtracklayer_1.36.3          GenomicRanges_1.28.3        qvalue_2.8.0                R6_2.2.2                   
[125] latticeExtra_0.6-28         hwriter_1.3.2               ShortRead_1.34.0            gridExtra_2.2.1            
[129] IRanges_2.10.2              codetools_0.2-15            MASS_7.3-47                 assertthat_0.2.0           
[133] SummarizedExperiment_1.6.3  rjson_0.2.15                GenomicAlignments_1.12.1    Rsamtools_1.28.0           
[137] mnormt_1.5-5                S4Vectors_0.14.3            GenomeInfoDbData_0.99.0     diptest_0.75-7             
[141] parallel_3.4.0              hms_0.3                     clusterProfiler_3.4.3       grid_3.4.0                 
[145] tidyr_0.6.3                 class_7.3-14                rvcheck_0.0.8               ggpubr_0.1.3               
[149] Biobase_2.36.2       

 

f.geist wrote (5 months ago):

So, apparently the error occurs during generation of the SummarizedExperiment, because GDCprepare works if I use the following code:

df <- GDCprepare(query,
                 save=TRUE,
                 save.filename = "Gene_Expression_Quantification.rda",
                 summarizedExperiment = FALSE)

Nevertheless, I would like to get a SummarizedExperiment, because I want to combine the expression data with the clinical data easily.
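
As a stopgap, a plain data frame can also be joined to clinical data by hand. This is a minimal sketch with toy data, assuming a genes-by-samples table whose column names are TCGA sample barcodes and a clinical table keyed by patient barcode (the first 12 characters of the sample barcode). The barcodes, gene names, and the `patient` and `gender` columns are made up, and the real layout of `df` may differ.

```r
# Toy expression table: genes in rows, sample barcodes as column names
expr <- data.frame(
  "TCGA-AA-0001-01A" = c(10.2, 3.1),
  "TCGA-AA-0002-01A" = c(8.7, 4.4),
  row.names = c("GENE1", "GENE2"),
  check.names = FALSE  # keep the dashes in the barcodes
)

# Toy clinical table: one row per patient
clinical <- data.frame(
  patient = c("TCGA-AA-0001", "TCGA-AA-0002"),
  gender  = c("male", "female"),
  stringsAsFactors = FALSE
)

# Transpose to one row per sample, then derive the patient barcode
samples <- as.data.frame(t(expr))
samples$patient <- substr(rownames(samples), 1, 12)

# Join expression values and clinical annotation on the patient barcode
combined <- merge(samples, clinical, by = "patient")
print(combined)
```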

Thank you so much for your help!

Felix Geist

 

PhD student, DKFZ Heidelberg

 
