The dataAssy object gives me exactly the same output as in your case, though I am getting some warnings in the previous steps and an error in colnames. I have pasted the complete session output below.
> library(TCGAbiolinks)
Warning messages:
1: replacing previous import by ‘grid::arrow’ when loading ‘TCGAbiolinks’
2: replacing previous import by ‘grid::unit’ when loading ‘TCGAbiolinks’
> library(SummarizedExperiment)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
    clusterApply, clusterApplyLB, clusterCall,
    clusterEvalQ, clusterExport, clusterMap,
    parApply, parCapply, parLapply, parLapplyLB,
    parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
    IQR, mad, xtabs
The following objects are masked from ‘package:base’:
    anyDuplicated, append, as.data.frame,
    as.vector, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get,
    grep, grepl, intersect, is.unsorted, lapply,
    lengths, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int,
    Position, rank, rbind, Reduce, rownames,
    sapply, setdiff, sort, table, tapply, union,
    unique, unlist, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
    Vignettes contain introductory material;
    view with 'browseVignettes()'. To cite
    Bioconductor, see 'citation("Biobase")', and
    for packages 'citation("pkgname")'.
> library(TCGAbiolinks)
> 
> cancer <- "BRCA"
> PlatformCancer <- "IlluminaHiSeq_RNASeqV2"
> dataType <- "rsem.genes.results"
> pathCancer <- paste0("../data",cancer)
> 
> datQuery <- TCGAquery(tumor = cancer, platform = PlatformCancer, level = "3")
> lsSample <- TCGAquery_samplesfilter(query = datQuery)
> 
> # get subtype information
> dataSubt <- TCGAquery_subtype(tumor = cancer)
> 
> # Which samples are Primary Solid Tumor
> dataSmTP <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample = "TP")
> 
> # Which samples are Solid Tissue Normal
> dataSmTN <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample ="NT")
> 
> # get clinical data
> dataClin <- TCGAquery_clinic(tumor = cancer, clinical_data_type = "clinical_patient")
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1 files
| Path:./nationwidechildrens.org_BRCA.bio.Level_2.0.42.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  |=============================================| 100%
Tumor type: BRCA
  |                                             |   0%
Adding disease collumn to data frame
> 
> TCGAdownload(data = datQuery,
+              path = pathCancer,
+              type = dataType,
+              samples = c(dataSmTP,dataSmTN))
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1211 files
| Path:../dataBRCA/unc.edu_BRCA.IlluminaHiSeq_RNASeqV2.Level_3.1.11.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  |=============================================| 100%
> 
> dataAssy <- TCGAprepare(query = datQuery,
+                         dir = pathCancer,
+                         type = dataType,
+                         save = TRUE,
+                         summarizedExperiment = TRUE,
+                         samples = c(dataSmTP,dataSmTN),
+                         filename = paste0(cancer,"_",PlatformCancer,".rda"))
  |=============================================================================================================| 100%
Adding metadata to the rse object...
Saving the data...
Data saved in: BRCA_IlluminaHiSeq_RNASeqV2.rda
Warning messages:
1: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
  Stopped reading at empty line 14607 but text exists afterwards (discarded): RALA|589
2: In data.table::data.table(...) :
  Item 2 is of size 14605 but maximum size is 20531 (recycled leaving remainder of 5926 items)
3: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
  Stopped reading at empty line 2346 but text exists afterwards (discarded): C21orf57|54059    500.0
4: In data.table::data.table(...) :
  Item 2 is of size 2344 but maximum size is 20531 (recycled leaving remainder of 1779 items)
> 
> dataAssy
class: RangedSummarizedExperiment 
dim: 20330 1211 
metadata(3): Query: TCGAprepareParameters
  FilesInfo:
assays(2): raw_counts scaled_estimate
rownames(20330): A1BG|1 A1CF|29974 ...
  ZZEF1|23140 ZZZ3|26009
rowRanges metadata column names(3): gene_id
  entrezgene
  transcript_id.transcript_id_TCGA-E9-A1RD-11A-33R-A157-07
colnames(1211): TCGA-E9-A1RD-11A-33R-A157-07
  TCGA-E9-A1RC-01A-11R-A157-07 ...
  TCGA-D8-A1J9-01A-11R-A13Q-07
  TCGA-AC-A6IX-01A-12R-A32P-07
colData names(10): sample patient ... Siglust
  PAM50
> dataPrep <- TCGAanalyze_Preprocessing(object = dataAssy, cor.cut = 0.6)
Error in `colnames<-`(`*tmp*`, value = c("TCGA-E9-A1RD-11A-33R-A157-07",  : 
  length of 'dimnames' [2] not equal to array extent
  
  > sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252    LC_MONETARY=English_India.1252 LC_NUMERIC=C                   LC_TIME=English_India.1252    
attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] SummarizedExperiment_1.0.2 Biobase_2.30.0             GenomicRanges_1.22.4       GenomeInfoDb_1.6.3         IRanges_2.4.6             
[6] S4Vectors_0.8.11           BiocGenerics_0.16.1        TCGAbiolinks_1.0.5        
loaded via a namespace (and not attached):
  [1] nlme_3.1-122                            bitops_1.0-6                            matrixStats_0.50.1                     
  [4] devtools_1.10.0                         doParallel_1.0.10                       RColorBrewer_1.1-2                     
  [7] httr_1.1.0                              Rgraphviz_2.14.0                        tools_3.2.3                            
 [10] R6_2.1.2                                affyio_1.40.0                           KernSmooth_2.23-15                     
 [13] DBI_0.3.1                               colorspace_1.2-6                        GGally_1.0.1                           
 [16] preprocessCore_1.32.0                   chron_2.3-47                            graph_1.48.0                           
 [19] rvest_0.3.1                             xml2_0.1.2                              sandwich_2.3-4                         
 [22] rtracklayer_1.30.1                      caTools_1.17.1                          scales_0.3.0                           
 [25] hexbin_1.27.1                           mvtnorm_1.0-5                           genefilter_1.52.1                      
 [28] affy_1.48.0                             DESeq_1.22.1                            stringr_1.0.0                          
 [31] supraHex_1.8.0                          digest_0.6.9                            Rsamtools_1.22.0                       
 [34] R.utils_2.2.0                           XVector_0.10.0                          limma_3.26.7                           
 [37] RSQLite_1.0.0                           BiocInstaller_1.20.1                    TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [40] zoo_1.7-12                              hwriter_1.3.2                           BiocParallel_1.4.3                     
 [43] gtools_3.5.0                            xlsx_0.5.7                              dplyr_0.4.3                            
 [46] R.oo_1.19.0                             RCurl_1.95-4.7                          magrittr_1.5                           
 [49] modeltools_0.2-21                       heatmap.plus_1.3                        futile.logger_1.4.1                    
 [52] Matrix_1.2-3                            Rcpp_0.12.3                             munsell_0.4.2                          
 [55] ape_3.4                                 R.methodsS3_1.7.0                       stringi_1.0-1                          
 [58] multcomp_1.4-3                          edgeR_3.12.0                            MASS_7.3-45                            
 [61] zlibbioc_1.16.0                         gplots_2.17.0                           plyr_1.8.3                             
 [64] grid_3.2.3                              gdata_2.17.0                            lattice_0.20-33                        
 [67] Biostrings_2.38.3                       splines_3.2.3                           xlsxjars_0.6.1                         
 [70] GenomicFeatures_1.22.12                 annotate_1.48.0                         EDASeq_2.4.1                           
 [73] igraph_1.0.1                            rjson_0.2.15                            geneplotter_1.48.0                     
 [76] codetools_0.2-14                        biomaRt_2.26.1                          futile.options_1.0.0                   
 [79] XML_3.98-1.3                            ShortRead_1.28.0                        downloader_0.4                         
 [82] latticeExtra_0.6-26                     lambda.r_1.1.7                          data.table_1.9.6                       
 [85] foreach_1.4.3                           gtable_0.1.2                            reshape_0.8.5                          
 [88] assertthat_0.1                          ggplot2_2.0.0                           dnet_1.0.7                             
 [91] aroma.light_3.0.0                       coin_1.1-2                              xtable_1.8-0                           
 [94] ConsensusClusterPlus_1.24.0             survival_2.38-3                         rJava_0.9-8                            
 [97] iterators_1.0.8                         GenomicAlignments_1.6.3                 AnnotationDbi_1.32.3                   
[100] memoise_1.0.0                           cluster_2.0.3                           TH.data_1.0-7   
  
                    
                
                 
One possibility is that one of the many packages you have loaded defines a generic colnames<- or dimnames<- that interferes with the version of these functions TCGAbiolinks is expecting. Immediately after the error occurs, run the command traceback():
> dataPrep <- TCGAanalyze_Preprocessing(object = dataAssy, cor.cut = 0.6)
Error in `colnames<-`(`*tmp*`, value = c("TCGA-E9-A1RD-11A-33R-A157-07",  : 
  length of 'dimnames' [2] not equal to array extent
> traceback()
## WHAT IS YOUR OUTPUT HERE?
Also, it would be helpful to see the output of
> selectMethod("colnames<-", c(class(dataAssy), "character"))
## expect: Error... no method found for signature...
> selectMethod("dimnames<-", c(class(dataAssy), "list"))
## expect: a method definition from the SummarizedExperiment package
Ok, I ran it on Windows and I still have no problem. But I don't get the warnings below. Maybe some files were corrupted during download. Could you please send me your dataAssy object?
1: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
Stopped reading at empty line 14607 but text exists afterwards (discarded): RALA|589
2: In data.table::data.table(...) :
Item 2 is of size 14605 but maximum size is 20531 (recycled leaving remainder of 5926 items)
3: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
Stopped reading at empty line 2346 but text exists afterwards (discarded): C21orf57|54059 500.0
4: In data.table::data.table(...) :
Item 2 is of size 2344 but maximum size is 20531 (recycled leaving remainder of 1779 items)
Here is the link to the dataAssy object that was created on my system:
https://drive.google.com/file/d/0B9SRy5XoOWiENlU4UmNTMVBYRjA/view?usp=sharing
Your object is equal to mine except for two samples. I believe some files were somehow corrupted during download, and the package does not check data integrity.
> assay(dataAssy)[15000:15002,492]
SCO1|6341 SCO2|9997 SCOC|60592
5.00 4015.42 3398.00
> assay(dataAssy2)[15000:15002,492]
SCO1|6341 SCO2|9997 SCOC|60592
1731 644 9153
The two files are unc.edu.6130b450-8b88-4a9a-b462-a34ec94183c9.1163157.rsem.genes.results and unc.edu.97b5ef6f-d621-4093-ab77-d60dcf706173.1152807.rsem.genes.results.
Could you remove them from dataBRCA and run TCGAdownload and TCGAprepare again?
Also my object is here
You can run this command to see if the prepared data are equal
Also, TCGAanalyze_Preprocessing had a bug. The fix should be in Bioconductor tonight (version 1.0.7). It is available in the GitHub repository.
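For anyone who wants to locate which samples differ between two prepared objects, here is a minimal base-R sketch. It is not necessarily the command referred to above. The two toy matrices stand in for assay(dataAssy) and assay(dataAssy2), the sample names are made up, the values are borrowed from the excerpt purely for illustration, and differing_samples is a hypothetical helper.

```r
# Toy stand-ins for assay(dataAssy2) and assay(dataAssy); sample names are made up.
genes   <- c("SCO1|6341", "SCO2|9997", "SCOC|60592")
samples <- c("sampleA", "sampleB", "sampleC")
mine <- matrix(c(10, 20, 30, 1731, 644, 9153, 7, 8, 9),
               nrow = 3, dimnames = list(genes, samples))
yours <- mine
yours[, "sampleB"] <- c(5.00, 4015.42, 3398.00)  # simulate one corrupted sample

# Hypothetical helper: names of the columns whose values differ between two
# matrices that share the same row and column names.
differing_samples <- function(a, b) {
  stopifnot(identical(dimnames(a), dimnames(b)))
  colnames(a)[colSums(a != b, na.rm = TRUE) > 0]
}

same <- isTRUE(all.equal(mine, yours))   # overall equality check
bad  <- differing_samples(mine, yours)   # columns that need a closer look
print(same)
print(bad)
```

With real objects, the same helper could be applied to the two assay matrices once their dimnames have been checked to match.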
We just added code to check data integrity; it is still being tested. Could you reinstall and run your code again?
The corrupted files should be downloaded again and TCGAprepare should not show any more warnings.
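In the meantime, a rough manual check for damaged tab-separated files can be sketched in base R. This is only a heuristic and not the check implemented in TCGAbiolinks: check_tsv_file is a hypothetical helper that flags a file if any line has a different number of fields than the header (which covers an empty line in the middle of the file, as in the fread warnings) or if the line count is not the expected one. The demo files and their two-column layout are made up.

```r
# Hypothetical helper: TRUE when a tab-separated file looks intact.
# A file is suspect if it is empty, if any line has a different number of
# tab-separated fields than the first line, or if the total line count
# differs from expected_rows (when given).
check_tsv_file <- function(path, expected_rows = NULL) {
  lines <- readLines(path, warn = FALSE)
  n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))
  ragged <- length(lines) == 0 || any(n_fields != n_fields[1])
  short <- !is.null(expected_rows) && length(lines) != expected_rows
  !(ragged || short)
}

# Demo on two made-up files in a temporary directory.
demo_dir <- tempfile("rsem_demo")
dir.create(demo_dir)
good <- file.path(demo_dir, "good.rsem.genes.results")
bad  <- file.path(demo_dir, "bad.rsem.genes.results")
writeLines(c("gene_id\traw_count", "A1BG|1\t10", "A1CF|29974\t20"), good)
writeLines(c("gene_id\traw_count", "A1BG|1\t10", "", "A1CF|29974"), bad)

files <- list.files(demo_dir, full.names = TRUE)
ok <- vapply(files, check_tsv_file, logical(1), expected_rows = 3)
suspect <- basename(files[!ok])   # candidates to delete and download again
print(suspect)
```

For the real data, the directory would be the download folder under dataBRCA, and the expected line count would have to be taken from a file known to be complete.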
@tiagochst Thank you for looking into this error. I installed the new version of TCGAbiolinks (1.0.7) and tried running through case 1. I no longer get the colnames error, but now there is an error at the data filtering and DEG steps.
Here is the complete run, including the traceback and session info, as well as a comparison of the dataAssy objects:
> library(SummarizedExperiment)
> library(TCGAbiolinks)
> 
> cancer <- "BRCA"
> PlatformCancer <- "IlluminaHiSeq_RNASeqV2"
> dataType <- "rsem.genes.results"
> pathCancer <- paste0("../data",cancer)
> 
> datQuery <- TCGAquery(tumor = cancer, platform = PlatformCancer, level = "3")
> lsSample <- TCGAquery_samplesfilter(query = datQuery)
> 
> # get subtype information
> dataSubt <- TCGAquery_subtype(tumor = cancer)
> 
> # Which samples are Primary Solid Tumor
> dataSmTP <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample = "TP")
> 
> # Which samples are Solid Tissue Normal
> dataSmTN <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample ="NT")
> 
> # get clinical data
> dataClin <- TCGAquery_clinic(tumor = cancer, clinical_data_type = "clinical_patient")
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1 files
| Path:./nationwidechildrens.org_BRCA.bio.Level_2.0.43.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  |====================================================================================================================| 100%
Tumor type: BRCA
  |                                             |   0%
Adding disease collumn to data frame
> 
> TCGAdownload(data = datQuery,
+              path = pathCancer,
+              type = dataType,
+              samples = c(dataSmTP,dataSmTN))
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1211 files
| Path:../dataBRCA/unc.edu_BRCA.IlluminaHiSeq_RNASeqV2.Level_3.1.11.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  |====================================================================================================================| 100%
> 
> dataAssy <- TCGAprepare(query = datQuery,
+                         dir = pathCancer,
+                         type = dataType,
+                         save = TRUE,
+                         summarizedExperiment = TRUE,
+                         samples = c(dataSmTP,dataSmTN),
+                         filename = paste0(cancer,"_",PlatformCancer,".rda"))
  |====================================================================================================================| 100%
Adding batch info to summarizedExperiment object
Adding metadata to the rse object...
Saving the data...
Data saved in: BRCA_IlluminaHiSeq_RNASeqV2.rda
Warning messages:
1: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
  Stopped reading at empty line 14607 but text exists afterwards (discarded): RALA|589
2: In data.table::data.table(...) :
  Item 2 is of size 14605 but maximum size is 20531 (recycled leaving remainder of 5926 items)
3: In fread(files[i], header = TRUE, sep = "\t", stringsAsFactors = FALSE) :
  Stopped reading at empty line 2346 but text exists afterwards (discarded): C21orf57|54059 500.0
4: In data.table::data.table(...) :
  Item 2 is of size 2344 but maximum size is 20531 (recycled leaving remainder of 1779 items)
> 
> dataPrep <- TCGAanalyze_Preprocessing(object = dataAssy, cor.cut = 0.6)
> dataNorm <- TCGAanalyze_Normalization(tabDF = dataPrep, geneInfo = geneInfo, method = "gcContent")
[1] "I Need about 307 seconds for this Complete Normalization Upper Quantile [Processing 80k elements /s] "
[1] "Step 1 of 4: newSeqExpressionSet ..."
[1] "Step 2 of 4: withinLaneNormalization ..."
[1] "Step 3 of 4: betweenLaneNormalization ..."
[1] "Step 4 of 4: .quantileNormalization ..."
Warning message:
In geneNames[, 1] == names(tmp[which(tmp > 1)]) :
  longer object length is not a multiple of shorter object length
> dataFilt <- TCGAanalyze_Filtering(tabDF = dataNorm, method = "quantile", qnt.cut = 0.25)
> dataDEGs <- TCGAanalyze_DEA(mat1 = dataFilt[,dataSmTN], mat2 = dataFilt[,dataSmTP], Cond1type = "Normal",
+                             Cond2type = "Tumor", fdr.cut = 0.01 , logFC.cut = 1, method = "glmLRT")
Error in dataFilt[, dataSmTN] : subscript out of bounds
> traceback()
2: cbind(mat1, mat2)
1: TCGAanalyze_DEA(mat1 = dataFilt[, dataSmTN], mat2 = dataFilt[, dataSmTP], Cond1type = "Normal", Cond2type = "Tumor", fdr.cut = 0.01, logFC.cut = 1, method = "glmLRT")
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252    LC_MONETARY=English_India.1252
[4] LC_NUMERIC=C                   LC_TIME=English_India.1252
attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] TCGAbiolinks_1.0.7         SummarizedExperiment_1.0.2 Biobase_2.30.0
[4] GenomicRanges_1.22.4       GenomeInfoDb_1.6.3         IRanges_2.4.6
[7] S4Vectors_0.8.11           BiocGenerics_0.16.1
loaded via a namespace (and not attached):
  [1] nlme_3.1-122                            bitops_1.0-6
  [3] matrixStats_0.50.1                      devtools_1.10.0
  [5] doParallel_1.0.10                       RColorBrewer_1.1-2
  [7] httr_1.1.0                              Rgraphviz_2.14.0
  [9] tools_3.2.3                             R6_2.1.2
 [11] affyio_1.40.0                           KernSmooth_2.23-15
 [13] DBI_0.3.1                               colorspace_1.2-6
 [15] GGally_1.0.1                            preprocessCore_1.32.0
 [17] chron_2.3-47                            graph_1.48.0
 [19] rvest_0.3.1                             xml2_0.1.2
 [21] sandwich_2.3-4                          rtracklayer_1.30.2
 [23] caTools_1.17.1                          scales_0.3.0
 [25] hexbin_1.27.1                           mvtnorm_1.0-5
 [27] genefilter_1.52.1                       affy_1.48.0
 [29] DESeq_1.22.1                            stringr_1.0.0
 [31] supraHex_1.8.0                          digest_0.6.9
 [33] Rsamtools_1.22.0                        R.utils_2.2.0
 [35] XVector_0.10.0                          limma_3.26.8
 [37] RSQLite_1.0.0                           BiocInstaller_1.20.1
 [39] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 zoo_1.7-12
 [41] hwriter_1.3.2                           BiocParallel_1.4.3
 [43] gtools_3.5.0                            xlsx_0.5.7
 [45] dplyr_0.4.3                             R.oo_1.19.0
 [47] RCurl_1.95-4.7                          magrittr_1.5
 [49] modeltools_0.2-21                       heatmap.plus_1.3
 [51] futile.logger_1.4.1                     Matrix_1.2-3
 [53] Rcpp_0.12.3                             munsell_0.4.3
 [55] ape_3.4                                 R.methodsS3_1.7.0
 [57] stringi_1.0-1                           multcomp_1.4-3
 [59] edgeR_3.12.0                            MASS_7.3-45
 [61] zlibbioc_1.16.0                         gplots_2.17.0
 [63] plyr_1.8.3                              grid_3.2.3
 [65] gdata_2.17.0                            lattice_0.20-33
 [67] Biostrings_2.38.4                       splines_3.2.3
 [69] xlsxjars_0.6.1                          GenomicFeatures_1.22.13
 [71] annotate_1.48.0                         EDASeq_2.4.1
 [73] igraph_1.0.1                            rjson_0.2.15
 [75] geneplotter_1.48.0                      codetools_0.2-14
 [77] biomaRt_2.26.1                          futile.options_1.0.0
 [79] XML_3.98-1.3                            ShortRead_1.28.0
 [81] downloader_0.4                          latticeExtra_0.6-28
 [83] lambda.r_1.1.7                          data.table_1.9.6
 [85] foreach_1.4.3                           gtable_0.1.2
 [87] reshape_0.8.5                           assertthat_0.1
 [89] ggplot2_2.0.0                           dnet_1.0.7
 [91] aroma.light_3.0.0                       coin_1.1-2
 [93] xtable_1.8-2                            ConsensusClusterPlus_1.24.0
 [95] survival_2.38-3                         rJava_0.9-8
 [97] iterators_1.0.8                         GenomicAlignments_1.6.3
 [99] AnnotationDbi_1.32.3                    memoise_1.0.0
[101] cluster_2.0.3                           TH.data_1.0-7
And upon comparing your dataAssy object with the one generated on my system (it can be accessed at https://drive.google.com/open?id=0B9SRy5XoOWiEbllNalIxaTZSVk0), they are not the same and differ as follows:
As there were outliers, the data frame lost some columns, but the subsetting in your TCGAanalyze_DEA call assumes the object still has them all. The code should be something like:
dataDEGs <- TCGAanalyze_DEA(mat1 = subset(dataFilt, select = colnames(dataFilt) %in% dataSmTN),
                            mat2 = subset(dataFilt, select = colnames(dataFilt) %in% dataSmTP),
                            Cond1type = "Normal", Cond2type = "Tumor",
                            fdr.cut = 0.01 , logFC.cut = 1, method = "glmLRT")
About the data download: I saw you are using 1.0.7, which does not have the verification of downloaded files, so it is not going to correct the files. Please install the GitHub version with:
and rerun the code.
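On the subsetting point in this reply, a small base-R sketch may make the failure mode clearer. The toy matrix stands in for dataFilt and the sample names are made up; it only illustrates why indexing by names that were dropped upstream fails and how restricting to the names still present avoids it.

```r
# Toy matrix standing in for dataFilt; sample names are made up.
dataFilt_toy <- matrix(1:6, nrow = 2,
                       dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s4")))
normal_ids <- c("s1", "s3")   # "s3" was removed upstream as an outlier

# Indexing by a column name that no longer exists raises an error,
# which is captured here as a string instead of stopping the script.
res <- tryCatch(dataFilt_toy[, normal_ids],
                error = function(e) conditionMessage(e))
print(res)

# Keeping only the names that are still present avoids the error.
kept <- intersect(normal_ids, colnames(dataFilt_toy))
normal_mat <- dataFilt_toy[, kept, drop = FALSE]
print(normal_mat)
```

The drop = FALSE argument keeps the result a matrix even when a single column survives, which matters if the result is later passed to cbind.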
I am not able to install with devtools. Am I doing something wrong?
nlme is outdated. Try updating the packages or installing the latest version manually from https://cran.r-project.org/web/packages/nlme/index.html (the latest version is 3.1-124).
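A quick way to confirm whether an installed package is older than a required version, sketched in base R. needs_update is a hypothetical helper name; packageVersion and package_version are standard functions from utils and base.

```r
# Hypothetical helper: TRUE when the installed version of pkg is older
# than the given minimum version string.
needs_update <- function(pkg, minimum) {
  utils::packageVersion(pkg) < package_version(minimum)
}

# The two versions mentioned in this thread compare as expected:
older <- package_version("3.1-122") < package_version("3.1-124")
print(older)

# On a real system (requires nlme to be installed):
# needs_update("nlme", "3.1-124")
```

If the check reports that an update is needed, update.packages() or install.packages("nlme") in a fresh session would be the usual next step.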
I updated nlme. Now the installation process has come up with three more errors.
Strangely, one of the complaints is about Java, but Java on my system is up to date:
> system("java -version")
java version "1.8.0_73"
Java(TM) SE Runtime Environment (build 1.8.0_73-b02)
Java HotSpot(TM) 64-Bit Server VM (build 25.73-b02, mixed mode)
How can I resolve this?
I am able to install TCGAbiolinks from the GitHub repo with a few warnings, but while downloading the corrupted file it enters an infinite loop:
> devtools::install_github(repo = "BioinformaticsFMRP/TCGAbiolinks")
Downloading GitHub repo BioinformaticsFMRP/TCGAbiolinks@master
from URL https://api.github.com/repos/BioinformaticsFMRP/TCGAbiolinks/zipball/master
Installing TCGAbiolinks
Skipping 2 unavailable packages: ALL, TxDb.Hsapiens.UCSC.hg19.knownGene
Installing 1 package: IRanges
Warning: package ‘IRanges’ is in use and will not be installed
"C:/PROGRA~1/R/R-32~1.3/bin/x64/R" --no-site-file --no-environ --no-save --no-restore CMD INSTALL \
"C:/Users/bioxcel/AppData/Local/Temp/Rtmp0gncV8/devtools147cf45485a/BioinformaticsFMRP-TCGAbiolinks-3b6b954" \
--library="C:/Users/bioxcel/Documents/R/win-library/3.2" --install-tests
* installing *source* package 'TCGAbiolinks' ...
** R
** data
*** moving datasets to lazyload DB
** inst
** tests
** preparing package for lazy loading
Note: the specification for S3 class "family" in package 'MatrixModels' seems equivalent to one from package 'lme4': not turning on duplicate class definitions for this class.
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
Note: the specification for S3 class "family" in package 'MatrixModels' seems equivalent to one from package 'lme4': not turning on duplicate class definitions for this class.
* DONE (TCGAbiolinks)
> library(SummarizedExperiment)
> library(TCGAbiolinks)
> 
> cancer <- "BRCA"
> PlatformCancer <- "IlluminaHiSeq_RNASeqV2"
> dataType <- "rsem.genes.results"
> pathCancer <- paste0("../data",cancer)
> 
> datQuery <- TCGAquery(tumor = cancer, platform = PlatformCancer, level = "3")
> lsSample <- TCGAquery_samplesfilter(query = datQuery)
> 
> # get subtype information
> dataSubt <- TCGAquery_subtype(tumor = cancer)
> 
> # Which samples are Primary Solid Tumor
> dataSmTP <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample = "TP")
> 
> # Which samples are Solid Tissue Normal
> dataSmTN <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample ="NT")
> 
> # get clinical data
> dataClin <- TCGAquery_clinic(tumor = cancer, clinical_data_type = "clinical_patient")
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   765  100   765    0     0    590      0  0:00:01  0:00:01 --:--:--   598
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1 files
| Path:./nationwidechildrens.org_BRCA.bio.Level_2.0.43.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  |                                             |   0%
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
This downloaded file might be corrupted, we are downloading it again.
[1] nationwidechildrens.org_clinical_patient_brca.txt
Still not able to install from GitHub in RStudio on Windows; here is the output:
> devtools::install_github(repo = "BioinformaticsFMRP/TCGAbiolinks")
Downloading GitHub repo BioinformaticsFMRP/TCGAbiolinks@master
from URL https://api.github.com/repos/BioinformaticsFMRP/TCGAbiolinks/zipball/master
Installing TCGAbiolinks
Skipping 2 unavailable packages: ALL, TxDb.Hsapiens.UCSC.hg19.knownGene
Installing 1 package: IRanges
Warning: package ‘IRanges’ is in use and will not be installed
"C:/PROGRA~1/R/R-32~1.3/bin/x64/R" --no-site-file --no-environ --no-save --no-restore CMD INSTALL \
"C:/Users/bioxcel/AppData/Local/Temp/Rtmp0gncV8/devtools147c13c166eb/BioinformaticsFMRP-TCGAbiolinks-c73bb0f" \
--library="C:/Users/bioxcel/Documents/R/win-library/3.2" --install-tests
* installing *source* package 'TCGAbiolinks' ...
Warning in file.copy(f, instdir, TRUE) :
problem copying .\NAMESPACE to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\NAMESPACE: Permission denied
Warning in file(file, ifelse(append, "a", "w")) :
cannot open file 'C:/Users/bioxcel/Documents/R/win-library/3.2/TCGAbiolinks/DESCRIPTION': No such file or directory
Error in file(file, ifelse(append, "a", "w")) :
cannot open the connection
ERROR: installing package DESCRIPTION failed for package 'TCGAbiolinks'
* restoring previous 'C:/Users/bioxcel/Documents/R/win-library/3.2/TCGAbiolinks'
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem copying C:\Users\bioxcel\Documents\R\win-library\3.2\00LOCK-BioinformaticsFMRP-TCGAbiolinks-c73bb0f\TCGAbiolinks\CITATION to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\CITATION: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\data: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem copying C:\Users\bioxcel\Documents\R\win-library\3.2\00LOCK-BioinformaticsFMRP-TCGAbiolinks-c73bb0f\TCGAbiolinks\DESCRIPTION to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\DESCRIPTION: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\help: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\html: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem copying C:\Users\bioxcel\Documents\R\win-library\3.2\00LOCK-BioinformaticsFMRP-TCGAbiolinks-c73bb0f\TCGAbiolinks\INDEX to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\INDEX: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\Meta: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem copying C:\Users\bioxcel\Documents\R\win-library\3.2\00LOCK-BioinformaticsFMRP-TCGAbiolinks-c73bb0f\TCGAbiolinks\NAMESPACE to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\NAMESPACE: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem copying C:\Users\bioxcel\Documents\R\win-library\3.2\00LOCK-BioinformaticsFMRP-TCGAbiolinks-c73bb0f\TCGAbiolinks\NEWS to C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\NEWS: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\R: No such file or directory
Warning in file.copy(lp, dirname(pkgdir), recursive = TRUE, copy.date = TRUE) :
problem creating directory C:\Users\bioxcel\Documents\R\win-library\3.2\TCGAbiolinks\tests: No such file or directory
Error: Command failed (1)
> traceback()
10: stop("Command failed (", status, ")", call. = FALSE)
9: system_check(r_path, options, c(r_profile(), r_env_vars(), env_vars),
...)
8: force(code)
7: withr::with_dir(path, system_check(r_path, options, c(r_profile(),
r_env_vars(), env_vars), ...))
6: R(paste("CMD INSTALL ", shQuote(built_path), " ", opts, sep = ""),
quiet = quiet)
5: install(source, ..., quiet = quiet, metadata = metadata)
4: FUN(X[[i]], ...)
3: vapply(remotes, install_remote, ..., FUN.VALUE = logical(1))
2: install_remotes(remotes, quiet = quiet, ...)
1: devtools::install_github(repo = "BioinformaticsFMRP/TCGAbiolinks")
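One possible reading of this log, stated as an assumption rather than a diagnosis: the "is in use and will not be installed" warning and the "Permission denied" on NAMESPACE are typical of installing over a package whose files are locked by a running R session on Windows. A small base-R sketch for checking which of the relevant packages are currently loaded (loaded_among is a hypothetical helper):

```r
# Hypothetical helper: which of the given packages are loaded in this session?
loaded_among <- function(pkgs) {
  intersect(pkgs, loadedNamespaces())
}

# Packages named in the install log above; the result is empty in a fresh session.
in_use <- loaded_among(c("TCGAbiolinks", "IRanges"))
print(in_use)

# If anything is listed, restarting R (and closing any other R or RStudio
# sessions that use the same library folder) before installing is the
# simplest thing to try.
```

This only inspects the current session; another open R process holding the same library would not show up here.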
Since you said this might be a Windows-specific error, I tried it with RStudio Server. I am able to install TCGAbiolinks there, but I get an error while downloading the files. It seems to me that the TCGA link for the BRCA RNASeq datasets has changed:
> library(SummarizedExperiment)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
    clusterApply, clusterApplyLB, clusterCall,
    clusterEvalQ, clusterExport, clusterMap,
    parApply, parCapply, parLapply, parLapplyLB,
    parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
    IQR, mad, xtabs
The following objects are masked from ‘package:base’:
    anyDuplicated, append, as.data.frame,
    as.vector, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get,
    grep, grepl, intersect, is.unsorted, lapply,
    lengths, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int,
    Position, rank, rbind, Reduce, rownames,
    sapply, setdiff, sort, table, tapply, union,
    unique, unlist, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
    Vignettes contain introductory material;
    view with 'browseVignettes()'. To cite
    Bioconductor, see 'citation("Biobase")', and
    for packages 'citation("pkgname")'.
> library(TCGAbiolinks)
> 
> cancer <- "BRCA"
> PlatformCancer <- "IlluminaHiSeq_RNASeqV2"
> dataType <- "rsem.genes.results"
> pathCancer <- paste0("../data",cancer)
> 
> datQuery <- TCGAquery(tumor = cancer, platform = PlatformCancer, level = "3")
> lsSample <- TCGAquery_samplesfilter(query = datQuery)
> 
> # get subtype information
> dataSubt <- TCGAquery_subtype(tumor = cancer)
> 
> # Which samples are Primary Solid Tumor
> dataSmTP <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample = "TP")
> 
> # Which samples are Solid Tissue Normal
> dataSmTN <- TCGAquery_SampleTypes(barcode = lsSample$IlluminaHiSeq_RNASeqV2, typesample ="NT")
> 
> # get clinical data
> dataClin <- TCGAquery_clinic(tumor = cancer, clinical_data_type = "clinical_patient")
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   765  100   765    0     0   1016      0 --:--:-- --:--:-- --:--:--  1017
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1 files
| Path:./nationwidechildrens.org_BRCA.bio.Level_2.0.43.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|                                                                                |   0%
[1] nationwidechildrens.org_clinical_patient_brca.txt
|================================================================================| 100%
Tumor type: BRCA
|                                                                                |   0%
Adding disease collumn to data frame
> 
> TCGAdownload(data = datQuery,
+ path = pathCancer,
+ type = dataType,
+ samples = c(dataSmTP,dataSmTN))
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 80970    0 80970    0     0  56407      0 --:--:--  0:00:01 --:--:-- 56385
100  216k    0  216k    0     0   128k      0 --:--:--  0:00:01 --:--:--  128k
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Downloading:1211 files
| Path:../dataBRCA/unc.edu_BRCA.IlluminaHiSeq_RNASeqV2.Level_3.1.11.0
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|                                                                                |   0%
[1] unc.edu.000d877f-8d03-44bc-8607-27b5ba84b5fe.1167702.rsem.genes.results
Error in download.file(url, method = method, ...) : 
  cannot open destfile '../dataBRCA/unc.edu_BRCA.IlluminaHiSeq_RNASeqV2.Level_3.1.11.0/unc.edu.000d877f-8d03-44bc-8607-27b5ba84b5fe.1167702.rsem.genes.results', reason 'No such file or directory'
> 

I see this error. If I
and then run
I get to
Enter a frame number, or 0 to exit

1: TCGAanalyze_Preprocessing(object = dataAssy, cor.cut = 0.6)
2: `colnames<-`(`*tmp*`, value = c("TCGA-E9-A1RD-11A-33R-A157-07", "TCGA-E9-A1

Selection:

Entering '1' and exploring a bit, it seems I'm at the last several lines of the function

samplesCor <- rowMeans(c)
objectWO <- assay(object,"raw_counts")[, samplesCor > cor.cut]
colnames(objectWO) <- colData(object)$sample

and that the subset leading to objectWO drops one column

Browse[1]> table(samplesCor > cor.cut)

FALSE  TRUE 
    1  1210 

so the attempt to update the colnames() has the wrong length; it should be

I'm not sure why the example is not reproducible by @tiagochst; maybe the upstream files have changed, and the attempt to reproduce uses a cache?
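To illustrate the length mismatch described above, here is a minimal base-R sketch on toy data. The names counts, sampleNames and keep are hypothetical stand-ins (for the assay matrix, colData(object)$sample and the logical filter), and the subsetting shown is only one plausible way to make the lengths agree, assuming the sample annotation is in the same order as the assay columns; it is not necessarily the fix adopted in the package.

```r
# Toy stand-ins (hypothetical data, not TCGA):
counts      <- matrix(1:20, nrow = 5)      # plays the role of assay(object, "raw_counts")
sampleNames <- c("S1", "S2", "S3", "S4")   # plays the role of colData(object)$sample
samplesCor  <- c(0.9, 0.8, 0.5, 0.7)       # plays the role of rowMeans(c)
cor.cut     <- 0.6

# The filter drops one column (S3), as in the table() output above.
keep     <- samplesCor > cor.cut
objectWO <- counts[, keep, drop = FALSE]

# Assigning all four names to the three remaining columns fails,
# which mirrors the `colnames<-` error in the traceback.
res <- tryCatch({
  colnames(objectWO) <- sampleNames
  "ok"
}, error = function(e) "error")

# Subsetting the names with the same logical index keeps the lengths in sync.
colnames(objectWO) <- sampleNames[keep]
print(colnames(objectWO))
```

Reusing one logical vector for both the columns and their names means the two cannot drift apart if cor.cut changes.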
I was impressed with the clear design of the package, especially that my download recovered rather than starting over!
> R.version[c("platform", "version.string")]
               _                                          
platform       x86_64-pc-linux-gnu                        
version.string R version 3.2.3 Patched (2016-01-28 r70038)

Yes, it's a bug. With a higher cor.cut I could reproduce the error, but it is strange that it was not reproducible with the same cor.cut.