Question: TCGAbiolinks GDCprepare fails to connect to Biomart web server
2.6 years ago, sabarinath.chandrasekharan wrote:

Hi,

I am trying to work with TCGAbiolinks, but I am having a problem with GDCprepare: it fails to connect to the BioMart web service.

> queryGBM <- GDCquery(project = "TCGA-GBM",
+                   data.category = "Gene expression",
+                   data.type = "Gene expression quantification",
+                   platform = "Illumina HiSeq", file.type  = "normalized_results",
+                   experimental.strategy = "RNA-Seq",
+                   barcode = c("TCGA-14-0736-02A-01R-2005-01", "TCGA-06-0211-02A-02R-2005-01"),
+                   legacy = TRUE)
Accessing GDC. This might take a while...
> GDCdownload(queryGBM)
All samples have been already downloded
> data <- GDCprepare(queryGBM)
  |============================================================================================| 100%
Downloading genome information. Using: Homo sapiens genes (GRCh37.p13)
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.

 

But I am able to access the BioMart web service directly with biomaRt:

> entrez = c("673","837")
> goids = getBM(attributes=c('entrezgene','go_id'), filters='entrezgene', values=entrez, mart=ensembl)
> head(goids)
  entrezgene      go_id
1        673           
2        673 GO:0005737
3        673 GO:0005886
4        673 GO:0005634
5        673 GO:0005829
6        673 GO:0005509

What could be the issue?

 

Thanks and Regards,

Sabari

Tags: tcgabiolinks, gdcprepare
written 2.6 years ago by sabarinath.chandrasekharan (modified 2.5 years ago)
Answer: TCGAbiolinks GDCprepare fails to connect to Biomart web server
2.6 years ago, Tiago Chedraoui Silva (Brazil - University of São Paulo / Los Angeles - Cedars-Sinai Medical Center) wrote:

Hi,

Because you used legacy = TRUE, TCGAbiolinks accesses Ensembl 75 (hg19/GRCh37). One of the servers it relies on might have been temporarily down.

Ensembl 75 can be reached with either of these calls (source: https://www.biostars.org/p/136775/):

grch37 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")

or 

ensembl_75 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="feb2014.archive.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")

TCGAbiolinks uses these same sources for Ensembl 75 (hg19/GRCh37):

https://github.com/BioinformaticsFMRP/TCGAbiolinks/blob/0fa5099c1c9d0d1bdab9365146e769af72c7c54e/R/TCGAPrepare.R#L502-L531
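
To check which of these hosts is reachable from your own R session, a small fallback loop can help. This is only a sketch: `connect_first_available` and `grch37_hosts` are names made up for illustration, and the `useMart` arguments are simply the ones quoted above. As far as I can tell, a mart created this way is a diagnostic only, since GDCprepare opens its own connection and does not pick up objects from your workspace.

```r
# Try each candidate BioMart host in turn and return the first one that answers.
# `connect` is a parameter so the loop can be exercised without network access;
# by default it calls biomaRt::useMart with the arguments quoted in this answer.
connect_first_available <- function(hosts,
                                    connect = function(host) {
                                      biomaRt::useMart(
                                        biomart = "ENSEMBL_MART_ENSEMBL",
                                        host    = host,
                                        path    = "/biomart/martservice",
                                        dataset = "hsapiens_gene_ensembl"
                                      )
                                    }) {
  for (host in hosts) {
    mart <- tryCatch(connect(host), error = function(e) {
      # Report the failure and move on to the next host.
      message("Host ", host, " failed: ", conditionMessage(e))
      NULL
    })
    if (!is.null(mart)) {
      return(list(host = host, mart = mart))
    }
  }
  stop("None of the BioMart hosts could be reached: ",
       paste(hosts, collapse = ", "))
}

# The two Ensembl 75 / GRCh37 hosts mentioned above.
grch37_hosts <- c("grch37.ensembl.org", "feb2014.archive.ensembl.org")

# Requires the biomaRt package and network access:
# result <- connect_first_available(grch37_hosts)
# result$host
```

If both hosts fail here while other tools on the same machine can reach them, the messages printed for each host are usually the most useful thing to post back.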

written 2.6 years ago by Tiago Chedraoui Silva

Still no luck.

> grch37 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
> ensembl_75 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="feb2014.archive.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
> GDCdownload(queryGBM)
All samples have been already downloded
> GDCprepare(queryGBM)
  |============================================================================================| 100%
Downloading genome information. Using: Homo sapiens genes (GRCh37.p13)
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.
written 2.6 years ago by sabarinath.chandrasekharan
Answer: TCGAbiolinks GDCprepare fails to connect to Biomart web server
2.6 years ago, sabarinath.chandrasekharan wrote:

GDCprepare threw a different error when I tried a different data set:

> query <- GDCquery(project = "TCGA-LUSC", data.category = "Gene expression", data.type = "Gene Expression Quantification", platform = "Illumina HiSeq", file.type  = "normalized_results", experimental.strategy = "RNA-Seq", sample.type = c("Primary solid Tumor"),  barcode = c("TCGA-85-8481-01A-11R-2326-07"," TCGA-56-8626-01A-11R-2403-07 "), legacy = TRUE)
Accessing GDC. This might take a while...

> GDCdownload(query)
All samples have been already downloded

> data <- GDCprepare(query)
Error in names(frame)[names(frame) == "x"] <- name : 
  names() applied to a non-vector

> data <- GDCprepare(query)

> data
function (..., list = character(), package = NULL, lib.loc = NULL, 
    verbose = getOption("verbose"), envir = .GlobalEnv) 
{
    fileExt <- function(x) {
        db <- grepl("\\.[^.]+\\.(gz|bz2|xz)$", x)
        ans <- sub(".*\\.", "", x)
        ans[db] <- sub(".*\\.([^.]+\\.)(gz|bz2|xz)$", "\\1\\2", 
            x[db])
        ans
    }
........................
    REST OF THE CODE HERE , REMOVED DUE TO WORD COUNT CONSTRAINT IN POSTING
........................
                  }
                  if (found) 
                    break
                }
                if (verbose) 
                  message(if (!found) 
                    "*NOT* ", "found", domain = NA)
            }
            if (found) 
                break
        }
        if (!found) 
            warning(gettextf("data set %s not found", sQuote(name)), 
                domain = NA)
    }
    invisible(names)
}
<bytecode: 0x0000000012c81af0>
<environment: namespace:utils>
> 

Is this a problem with accessing the API or is there any problem with my query structure?
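
One thing that stands out in the query above: the second barcode is written with a leading and a trailing space. Whether that is what triggered the `names()` error is not established in this thread, but it is cheap to rule out. A minimal sketch, assuming the usual 28-character aliquot barcode layout; `barcode_pattern` and `clean_barcodes` are illustrative names, not part of TCGAbiolinks:

```r
# Assumed layout of an aliquot barcode, e.g. TCGA-85-8481-01A-11R-2326-07.
barcode_pattern <- paste0(
  "^TCGA-[A-Z0-9]{2}-[A-Z0-9]{4}-",   # project, tissue source site, participant
  "[0-9]{2}[A-Z]-[0-9]{2}[A-Z]-",     # sample + vial, portion + analyte
  "[A-Z0-9]{4}-[0-9]{2}$"             # plate, center
)

# Strip surrounding whitespace and warn about anything that still looks wrong.
clean_barcodes <- function(barcodes) {
  trimmed <- trimws(barcodes)
  bad <- trimmed[!grepl(barcode_pattern, trimmed)]
  if (length(bad) > 0) {
    warning("Unexpected barcode format: ", paste(bad, collapse = ", "))
  }
  trimmed
}

# The two barcodes exactly as typed in the query above.
barcodes <- clean_barcodes(c("TCGA-85-8481-01A-11R-2326-07",
                             " TCGA-56-8626-01A-11R-2403-07 "))
```

The cleaned vector can then be passed as `barcode = barcodes` in GDCquery. Separately, `data` printing a function body means no object called `data` exists in the workspace, so R shows `utils::data` instead; a less ambiguous name such as `lusc_data` avoids that confusion.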

 

Thanks,

Sabari

written 2.6 years ago by sabarinath.chandrasekharan (modified 2.6 years ago)

Could you send me the output of sessionInfo() from R?

Also, is this the latest version of the package?

written 2.5 years ago by tiagochst

Sorry about the delay.

Here is the output again, together with the sessionInfo():

> query_GBM <- GDCquery(project = "TCGA-GBM",
+                   data.category = "Gene expression",
+                   data.type = "Gene expression quantification",
+                   platform = "Illumina HiSeq", file.type  = "normalized_results",
+                   experimental.strategy = "RNA-Seq",
+                   barcode = c("TCGA-14-0736-02A-01R-2005-01", "TCGA-06-0211-02A-02R-2005-01"),
+                   legacy = TRUE)
Accessing GDC. This might take a while...
> GDCdownload(query_GBM)
Of the 2 files for download 2 already exist.
All samples have been already downloaded
> z <- GDCprepare(query_GBM)
  |============================================================================================| 100%
Downloading genome information. Using: Homo sapiens genes (GRCh37.p13)
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.
> 
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.0.13

loaded via a namespace (and not attached):
  [1] TH.data_1.0-7                           colorspace_1.2-6                       
  [3] rjson_0.2.15                            hwriter_1.3.2                          
  [5] class_7.3-14                            modeltools_0.2-21                      
  [7] mclust_5.2                              circlize_0.3.9                         
  [9] XVector_0.12.1                          GenomicRanges_1.24.3                   
 [11] GlobalOptions_0.0.10           
 [.]                          
                          
[129] munsell_0.4.3                         

>

written 2.5 years ago by sabarinath.chandrasekharan
Answer: TCGAbiolinks GDCprepare fails to connect to Biomart web server
2.5 years ago, sabarinath.chandrasekharan wrote:

I am still having this problem: GDCprepare cannot connect to the BioMart server, while other programs can. Is anybody else facing this problem? Is there any workaround?

written 2.5 years ago by sabarinath.chandrasekharan

Does the code below work?

written 2.5 years ago by tiagochst

Sorry, this code does not work either.

> hg19 <- get.GRCh.bioMart()
Downloading genome information. Using: Homo sapiens genes (GRCh37.p13)
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.
> hg38 <- get.GRCh.bioMart("hg38")
Downloading genome information. Using: Homo sapiens genes (GRCh38.p7)
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.
> # Test 2: default server
> ensembl <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
+                    dataset = "hsapiens_gene_ensembl")
> attributes <- c("chromosome_name",
+                 "start_position",
+                 "end_position", "strand",
+                 "ensembl_gene_id", "entrezgene",
+                 "external_gene_id")
> chrom <- c(1:22, "X", "Y")
> gene.location <- getBM(attributes = attributes,
+                        filters = c("chromosome_name"),
+                        values = list(chrom), mart = ensembl)
Error in getBM(attributes = attributes, filters = c("chromosome_name"),  : 
  Invalid attribute(s): external_gene_id 
Please use the function 'listAttributes' to get valid attribute names
> 
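
The `Invalid attribute(s): external_gene_id` error looks like a separate issue from the connection failure: the attribute list is fetched from the server, so the default host was evidently reachable, and current Ensembl releases expose that field as `external_gene_name`. A minimal sketch of checking the requested attributes before calling getBM; `resolve_attributes` is a hypothetical helper, and `available` is hard-coded here so the example runs offline, whereas in practice it would come from `listAttributes(ensembl)$name`:

```r
# Replace attribute names that the mart no longer offers with their known
# successors, and fail early if anything is still unavailable.
resolve_attributes <- function(wanted, available,
                               renames = c(external_gene_id = "external_gene_name")) {
  absent <- !(wanted %in% available)
  # Only swap names that are missing AND have a known replacement.
  swap <- absent & wanted %in% names(renames)
  wanted[swap] <- renames[wanted[swap]]
  still_absent <- setdiff(wanted, available)
  if (length(still_absent) > 0) {
    stop("Attributes not offered by this mart: ",
         paste(still_absent, collapse = ", "))
  }
  unname(wanted)
}

# Stand-in for listAttributes(ensembl)$name (assumption, for offline use).
available <- c("chromosome_name", "start_position", "end_position",
               "strand", "ensembl_gene_id", "external_gene_name")

attrs <- resolve_attributes(
  c("chromosome_name", "start_position", "end_position",
    "strand", "ensembl_gene_id", "external_gene_id"),
  available
)
```

With a live mart, `attrs` would then be passed as `attributes = attrs` in the getBM call from Test 2.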

 

written 2.5 years ago by sabarinath.chandrasekharan