How can I download all bigwig files for TCGA samples? I noticed an answer was provided previously for recount2 but it doesn’t work for recount3:
Thanks!
Hi,
Thank you for your interest in recount3 and recount2. The easiest way to find the URLs for BigWig files in recount3 is to use the recount3::create_rse() function, which includes a colData() column called BigWigURL, as shown at https://github.com/LieberInstitute/recount3/issues/21#issuecomment-1074156958. Here's a short extract:
as.data.frame(colData(rse)[1, c("external_id", "study", "BigWigURL")])
#> external_id study
#> GTEX-T6MN-0011-R1A-SM-32QOY.1 GTEX-T6MN-0011-R1A-SM-32QOY.1 BRAIN
#> BigWigURL
#> GTEX-T6MN-0011-R1A-SM-32QOY.1 http://duffel.rail.bio/recount3/human/data_sources/gtex/base_sums/IN/BRAIN/OY/gtex.base_sums.BRAIN_GTEX-T6MN-0011-R1A-SM-32QOY.1.ALL.bw
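For TCGA, the same approach might look like the sketch below. It assumes that recount3::available_projects() lists the TCGA projects with file_source equal to "tcga" and that "BRCA" is one of the project names. The helper name tcga_bigwig_urls() is made up for illustration and is not part of recount3.

```r
# Hypothetical helper (not part of recount3): return the BigWig URLs
# for a single TCGA project, named by external_id.
tcga_bigwig_urls <- function(project_name) {
  # One row per project; TCGA projects are assumed to have file_source "tcga"
  projects <- recount3::available_projects(organism = "human")
  info <- projects[
    projects$file_source == "tcga" & projects$project == project_name,
  ]
  stopifnot(nrow(info) == 1)

  # Builds the gene-level RSE, which downloads the counts and metadata
  rse <- recount3::create_rse(info)

  sample_info <- SummarizedExperiment::colData(rse)
  urls <- sample_info$BigWigURL
  names(urls) <- sample_info$external_id
  urls
}

# Usage (needs network access; "BRCA" is an assumed project name):
# brca_urls <- tcga_bigwig_urls("BRCA")
# head(brca_urls)
```

To cover all of TCGA you would repeat this for every TCGA row returned by available_projects(), keeping in mind that each call also downloads the gene-level data for that project.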
You could also use recount3::locate_url(). However, as noted at https://github.com/LieberInstitute/recount3/issues/21#issuecomment-1074156958, that function doesn't guarantee that the result is a valid URL, for programmatic reasons on the data host side (IDIES at JHU).
Using recount3::create_rse() at the gene level might mean downloading a bit too much data for a large project such as TCGA (which is split by tissue, as is GTEx). You might therefore prefer to dive into the internal code of recount3::create_rse_manual() and re-use the part that builds the BigWig URLs (https://github.com/LieberInstitute/recount3/blob/6eb14b844062ebdf45fe5a356577e3ea0483c97e/R/create_rse_manual.R#L156-L165) after downloading the TCGA metadata files.
As you can see, there are a few different options, with different degrees of complexity.
Once you have located the URLs, you can use recount3::file_retrieve(), which internally uses BiocFileCache::bfcrpath() (https://github.com/LieberInstitute/recount3/blob/6eb14b844062ebdf45fe5a356577e3ea0483c97e/R/file_retrieve.R#L80), or download them some other way, such as recount::download_retry(), which internally uses downloader::download() (https://github.com/leekgroup/recount/blob/10f29f9d44906f798aa3a7655ae40ac269c36ae5/R/download_retry.R#L39).
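As a rough illustration of the "some other way" route, here is a minimal base R sketch that saves each URL under its own file name and skips files that already exist, so an interrupted run can be resumed. The function name download_bigwigs(), the default directory name, and the fetch argument are assumptions made for this example. It uses utils::download.file() rather than either of the functions mentioned above.

```r
# Hypothetical helper (not part of recount3 or recount): download a vector
# of BigWig URLs into dest_dir, skipping files that are already present.
download_bigwigs <- function(urls, dest_dir = "tcga_bigwigs",
                             fetch = utils::download.file) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)

  # Name each local file after the last component of its URL
  dest <- file.path(dest_dir, basename(urls))

  for (i in seq_along(urls)) {
    if (!file.exists(dest[i])) {
      # mode = "wb" keeps the binary BigWig file intact on Windows
      fetch(urls[i], dest[i], mode = "wb")
    }
  }
  invisible(dest)
}

# Usage (needs network access), given a character vector of BigWig URLs:
# local_files <- download_bigwigs(bigwig_urls)
#
# Cached alternative using the function mentioned above:
# local_files <- vapply(bigwig_urls, recount3::file_retrieve, character(1))
```

The file.exists() check is only a convenience and does not detect partially downloaded files, so you may want to delete anything left behind by a failed transfer before re-running.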
Best, Leo