TCGAbiolinks GDCquery Error: Parsing problems
Ramiro Magno ▴ 100
@ramiro-magno-12376
Last seen 5.5 years ago
CBMR, Faro, Portugal

I am using TCGAbiolinks 2.5.9.

It seems that something in the GDC API has changed, because

GDCquery("TCGA-BRCA")

Returns:

o GDCquery: Searching in GDC database

Genome of reference: hg38
Warning: 40 parsing failures.
# A tibble: 5 x 5
  row col expected  actual     file
1   1     8 columns 71 columns 'https://gdc-api.nci.nih.gov/projects?size=1000&format=tsv'
2   2     8 columns 71 columns 'https://gdc-api.nci.nih.gov/projects?size=1000&format=tsv'
3   3     8 columns 71 columns 'https://gdc-api.nci.nih.gov/projects?size=1000&format=tsv'
4   4     8 columns 71 columns 'https://gdc-api.nci.nih.gov/projects?size=1000&format=tsv'
5   5     8 columns 71 columns 'https://gdc-api.nci.nih.gov/projects?size=1000&format=tsv'
[... truncated]

  |disease_type.2              |disease_type.5                      |disease_type.4       |disease_type.6               |
  |:---------------------------|:-----------------------------------|:--------------------|:----------------------------|
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |Thymic Epithelial Neoplasms |Complex Mixed and Stromal Neoplasms |Basal Cell Neoplasms |Ductal and Lobular Neoplasms |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |
  |NA                          |NA                                  |NA                   |NA                           |

Error in checkProjectInput(project) : Please set a valid project argument from the column id above. Project TCGA-BRCA was not found.
In addition: Warning messages:
1: Unnamed col_types should have the same length as `col_names`. Using smaller of the two.
2: In rbind(names(probs), probs_f) : number of columns of result is not a multiple of vector length (arg 1)
3: Unknown or uninitialised column: 'project_id'.
4: Unknown or uninitialised column: 'project_id'.

The problem seems to be in the function getGDCprojects, in the file R/internal.R.

Simplifying getGDCprojects so that it downloads only JSON (instead of the TSV response that triggers the parsing failures) seems to solve the issue:

#' @title Retrieve all GDC projects
#' @description
#'   getGDCprojects uses the following API endpoint to get projects:
#'   https://gdc-api.nci.nih.gov/projects
#' @export
#' @importFrom httr GET content
#' @importFrom jsonlite fromJSON
#' @importFrom stringr str_split
#' @examples
#' projects <- getGDCprojects()
#' @return A data frame with the latest GDC projects
getGDCprojects <- function(){

  # Request JSON rather than TSV; the TSV response is what fails to parse
  url <- "https://gdc-api.nci.nih.gov/projects?size=1000&format=json"
  json <- fromJSON(content(GET(url), as = "text", encoding = "UTF-8"), simplifyDataFrame = TRUE)
  projects <- json$data$hits

  # The tumor code is the second "-"-separated field, e.g. "BRCA" in "TCGA-BRCA"
  projects$tumor <- vapply(projects$project_id,
                           function(x) str_split(x, "-")[[1]][2],
                           character(1), USE.NAMES = FALSE)
  return(projects)
}
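
For anyone who wants to check the logic without hitting the API, here is a minimal offline sketch. The JSON string is a hand-written, hypothetical sample that only mimics the shape the function above relies on (data$hits with a project_id field); a real response carries many more fields, and the "name" values are made up.

```r
library(jsonlite)

# Hypothetical sample mimicking the assumed response shape (data$hits).
sample_json <- '{"data": {"hits": [
  {"project_id": "TCGA-BRCA", "name": "Sample project A"},
  {"project_id": "TARGET-AML", "name": "Sample project B"}
]}}'

# fromJSON() simplifies the array of records into a data frame
parsed <- fromJSON(sample_json, simplifyDataFrame = TRUE)
projects <- parsed$data$hits

# Take the second "-"-separated field of each project id
projects$tumor <- vapply(
  strsplit(projects$project_id, "-", fixed = TRUE),
  function(parts) parts[2],
  character(1)
)

# The project lookup that failed in the original report
found <- "TCGA-BRCA" %in% projects$project_id
print(projects)
print(found)
```

Because the project_id column is present after JSON parsing, a lookup like the one that produced "Project TCGA-BRCA was not found" has something to match against.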
maysa_taheir ▴ 50
@maysa_taheir-14275
Last seen 7.1 years ago

I had the same problem and fixed it by installing TCGAbiolinks from GitHub with devtools:

devtools::install_github("BioinformaticsFMRP/TCGAbiolinks")

It didn't work for me the first time, as others in this group have also reported, so I ended up doing the following:

(1) I uninstalled R, deleted all of its associated libraries, and then reinstalled it.

(2) I installed the GenomeInfoDbData package separately, because the TCGAbiolinks installation was interrupted every time by the message "there is no package called 'GenomeInfoDbData'". I used:

source("https://bioconductor.org/biocLite.R")
biocLite("GenomeInfoDbData")

(3) I re-ran devtools::install_github("BioinformaticsFMRP/TCGAbiolinks"), and the problem was finally fixed.
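
Before going as far as reinstalling R, it may be worth checking which of the packages mentioned in the steps above are actually missing. A rough sketch follows; `missing_pkgs` is a hypothetical helper name, not part of any package, and the install commands are left commented out because they need network access.

```r
# Return the subset of `pkgs` whose namespaces cannot be loaded
# from the current library paths.
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

needed <- missing_pkgs(c("devtools", "GenomeInfoDbData"))
print(needed)

# Install only what is missing, then retry the GitHub install:
# if ("devtools" %in% needed) install.packages("devtools")
# if ("GenomeInfoDbData" %in% needed) {
#   source("https://bioconductor.org/biocLite.R")  # as in step (2)
#   biocLite("GenomeInfoDbData")
# }
# devtools::install_github("BioinformaticsFMRP/TCGAbiolinks")
```

Note that newer Bioconductor releases replaced biocLite() with BiocManager::install(), so the commented lines may need adjusting depending on the R version in use.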

maysa_taheir ▴ 50
@maysa_taheir-14275
Last seen 7.1 years ago

I'm having the same problem. Are there any solutions, please?

 
