TCGAbiolinks not downloading the 450k data
1
0
Entering edit mode
tangming2005 ▴ 200
@tangming2005-6754
Last seen 11 weeks ago
United States

Hi,

I know CGHub is down and TCGAbiolinks has changed a lot. The manual says it works with 450k data, but it does not work for me.

 

query.GBM<- GDCquery(project = "TCGA-GBM",
                      data.category = "DNA methylation", 
                      platform = "Illumina Human Methylation 450", 
                      legacy = TRUE)

str(query.GBM$results)

## download data
GDCdownload(query.GBM, directory = "GBM_meth")

GDCdownload will download: 3.300125655 GB compressed in a tar.gz file
Downloading as: Tue_Sep_20_16_31_55_2016.tar.gz
Error in file(con, "rb") : cannot open the connection
In addition: Warning message:
In file(con, "rb") :
  cannot open file 'Tue_Sep_20_16_31_55_2016.tar.gz': No such file or directory

The file is missing somehow?

Thanks!

Ming

@tiago-chedraoui-silva-8877
Last seen 4.3 years ago
Brazil - University of São Paulo/ Los A…

Hi,

Your code is right. There might be a connection error with the GDC API. Unfortunately, this problem happens very often.

Using the client method or downloading fewer samples at a time may solve the problem.

# For the client method
library(TCGAbiolinks)

GDCdownload(query.GBM, method = "client", directory = "GBM_meth")

# For fewer samples at a time
query.GBM <- GDCquery(project = "TCGA-GBM",
                      data.category = "DNA methylation",
                      platform = "Illumina Human Methylation 450",
                      legacy = TRUE)

n <- nrow(query.GBM$results[[1]])
step <- 5
# Download the files in batches of `step` rows
for (start in seq(1, n, by = step)) {
    end <- min(start + step - 1, n)
    query_methy.aux <- query.GBM
    query_methy.aux$results[[1]] <- query_methy.aux$results[[1]][start:end, ]
    GDCdownload(query_methy.aux, method = "api", directory = "GBM_meth")
}
GDCdownload(query.GBM, method = "api", directory = "GBM_meth") # just to be sure everything was downloaded
# Read from the same directory the files were downloaded to
met <- GDCprepare(query = query.GBM,
                  directory = "GBM_meth",
                  save = TRUE,
                  save.filename = "GBM_DNAmethylation_450k.rda",
                  summarizedExperiment = TRUE)
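
The batching logic can be checked offline before any download is attempted. The following is a minimal base R sketch, not part of TCGAbiolinks: `batch_indices` is a hypothetical helper that only shows which row indices each iteration would select for a given `n` and `step`.

```r
# Hypothetical helper (not a TCGAbiolinks function): split the row
# indices 1..n into consecutive batches of at most `step` rows.
batch_indices <- function(n, step) {
  stopifnot(n >= 0, step >= 1)
  idx <- seq_len(n)
  # ceiling(idx / step) gives the batch number of each index
  unname(split(idx, ceiling(idx / step)))
}

# Example with made-up sizes: 12 rows in batches of 5
batches <- batch_indices(12, 5)
for (b in batches) {
  cat("rows", min(b), "to", max(b), "\n")
}
```

Each element of `batches` could then be used to subset `query.GBM$results[[1]]` in place of the `start:end` arithmetic.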

If the problem continues, could you please send your sessionInfo()?

Best regards,

Tiago


Thanks, I am using the `client` method instead.


Is it possible to query two projects together?

 

query<- GDCquery(project = c("TCGA-GBM", "TCGA-LGG"),
                      data.category = "DNA methylation", 
                      platform = "Illumina Human Methylation 450", 
                      legacy = TRUE)

And can the data be prepared together as well?

Thanks.


No, but you can cbind them.

query.lgg <- GDCquery(project = "TCGA-LGG",
                      data.category = "DNA methylation",
                      platform = "Illumina Human Methylation 450",
                      legacy = TRUE)
GDCdownload(query.lgg)
met.lgg <-GDCprepare(query.lgg, save = FALSE)

query.gbm <- GDCquery(project = "TCGA-GBM",
                      data.category = "DNA methylation",
                      platform = "Illumina Human Methylation 450",
                      legacy = TRUE)
GDCdownload(query.gbm)
met.gbm <- GDCprepare(query.gbm, save = FALSE)
met.lgg.gbm <- SummarizedExperiment::cbind(met.lgg, met.gbm)
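
Column-binding only makes sense when both objects describe the same rows in the same order. The base R sketch below illustrates that check on two small toy matrices standing in for the assay data. The probe IDs, sample names, and values are made up, and it assumes (this thread does not confirm it) that the two data sets could differ in their row sets.

```r
# Toy stand-ins for two methylation matrices (rows = probes, cols = samples).
# All names and values here are invented for illustration.
lgg <- matrix(c(0.1, 0.5, 0.9, 0.2, 0.6, 0.8), nrow = 3,
              dimnames = list(c("cg01", "cg02", "cg03"), c("LGG_1", "LGG_2")))
gbm <- matrix(c(0.7, 0.3, 0.4), nrow = 3,
              dimnames = list(c("cg03", "cg01", "cg04"), "GBM_1"))

# Keep only the probes present in both, in one consistent order
common <- intersect(rownames(lgg), rownames(gbm))

# Subset both matrices by name so rows line up before binding columns
combined <- cbind(lgg[common, , drop = FALSE],
                  gbm[common, , drop = FALSE])
print(combined)
```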

OK, we added an option to download in chunks.

query.GBM <- GDCquery(project = "TCGA-GBM",
                      data.category = "DNA methylation",
                      platform = "Illumina Human Methylation 450",
                      legacy = TRUE)
GDCdownload(query.GBM, method = "api", directory = "GBM_meth", chunks.per.download = 10)

If it crashes, re-execute it and it will download only what is missing.
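
The idea of re-running and fetching only what is missing can be sketched in base R as follows. This is an illustration only and does not show how GDCdownload is implemented: `missing_files` is a hypothetical helper, and the file names and temporary directory are made up.

```r
# Hypothetical helper: which of the expected files are not yet on disk?
missing_files <- function(expected, directory) {
  present <- list.files(directory)
  setdiff(expected, present)
}

# Made-up setup: a fresh temporary directory and three expected files
download_dir <- tempfile("GBM_meth_demo")
dir.create(download_dir)
expected <- c("sample_a.txt", "sample_b.txt", "sample_c.txt")

# Simulate a run that crashed after fetching only the first file
file.create(file.path(download_dir, "sample_a.txt"))

# On re-execution, only the remaining files would need to be fetched
todo <- missing_files(expected, download_dir)
print(todo)
```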


GDCprepare is not working either.

I checked that BioMart is up.

GBM.meth <- GDCprepare(query = query.GBM.meth,
                  save = TRUE,
                  save.filename = "GBM_DNAmethylaltion_450k.rda",
                  summarizedExperiment = TRUE)

  |======================================================================| 100%
Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.  Check http://www.biomart.org and verify if this website is available.

Error in textConnection(attrfilt) : invalid 'text' argument

Is it still failing?

This might be related to this: TCGAbiolinks GDCprepare fails to connect to Biomart web server

 
