Recount2 Bigwigs for TCGA
@alexvnesta-23044

Hi,

I would like to obtain bigwig RNA-Seq coverage files for specific TCGA samples. Additionally, it would be great if I could download only a specific locus from those bigwigs, and even average the coverage across multiple samples.

I believe this can all be achieved using the recount package (recount2) and the associated packages mentioned in its quick start guide.

I simply cannot figure out how to download the TCGA bigwig files.

The recount quick start guide only explains how to use the download_study function with an SRA ID, and I don't think TCGA has SRA IDs. If someone can help me get over the hump of importing the TCGA data as a RangedSummarizedExperiment simply by providing the TCGA ID, that would be great.

Thanks, Alex

R TCGA recount2 recount

I found out how to download all of the TCGA bigwig files, but I only want to download the bigwigs for certain TCGA IDs.

Here is what I have tried so far:

sapply(unique("TCGA"), download_study, type = 'samples')


Is there a way to specify TCGA IDs in the project argument?


Hi,

I don't know why I never got an email about your question, even though you did use the recount tag. In any case, the answer lies in the recount_url data.frame provided by the recount package. You can subset it to the TCGA project, keep only the bigwig files, and then use the resulting URLs. You might or might not want to download the mean_TCGA.bw file.

> library(recount)
> head(subset(recount_url, grepl('\\.bw$', file_name) & project == 'TCGA'))
                                                                                            path
83847                              /dcl01/leek/data/recount-website/mean/means_tcga/mean_TCGA.bw
83848 /dcl01/leek/data/tcga/v1/batch_29/coverage_bigwigs/3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw
83849 /dcl01/leek/data/tcga/v1/batch_27/coverage_bigwigs/B1E54366-42B9-463C-8615-B34D52BD14DC.bw
83850 /dcl01/leek/data/tcga/v1/batch_14/coverage_bigwigs/473713F7-EB41-4F20-A37F-ACD209E3CB75.bw
83851 /dcl01/leek/data/tcga/v1/batch_22/coverage_bigwigs/11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw
83852 /dcl01/leek/data/tcga/v1/batch_26/coverage_bigwigs/136B7576-1108-4FA3-8254-6069F0CA879A.bw
                                    file_name project version1 version2
83847                            mean_TCGA.bw    TCGA     TRUE    FALSE
83848 3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw    TCGA     TRUE    FALSE
83849 B1E54366-42B9-463C-8615-B34D52BD14DC.bw    TCGA     TRUE    FALSE
83850 473713F7-EB41-4F20-A37F-ACD209E3CB75.bw    TCGA     TRUE    FALSE
83851 11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw    TCGA     TRUE    FALSE
83852 136B7576-1108-4FA3-8254-6069F0CA879A.bw    TCGA     TRUE    FALSE
                                                                                 url
83847                            http://duffel.rail.bio/recount/TCGA/bw/mean_TCGA.bw
83848 http://duffel.rail.bio/recount/TCGA/bw/3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw
83849 http://duffel.rail.bio/recount/TCGA/bw/B1E54366-42B9-463C-8615-B34D52BD14DC.bw
83850 http://duffel.rail.bio/recount/TCGA/bw/473713F7-EB41-4F20-A37F-ACD209E3CB75.bw
83851 http://duffel.rail.bio/recount/TCGA/bw/11F18F54-9B33-4C33-BDF9-0F093F4F3336.bw
83852 http://duffel.rail.bio/recount/TCGA/bw/136B7576-1108-4FA3-8254-6069F0CA879A.bw
> dim(subset(recount_url, grepl('\\.bw$', file_name) & project == 'TCGA'))
[1] 11285     6
> packageVersion('recount')
[1] ‘1.12.1’
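
To restrict this to particular samples, the same subsetting idea can be wrapped in a small helper. What follows is only a minimal sketch under some assumptions: select_bigwig_urls is a hypothetical function name (it is not part of recount), the toy table simply reuses three rows from the output above, and the identifiers are assumed to be the UUIDs that appear in the bigwig file names. If you are starting from TCGA barcodes, you would first need to map them to these UUIDs (for example via the TCGA sample metadata), which is not shown here.

```r
# Hypothetical helper (not part of recount): return the bigwig URLs for a set
# of sample identifiers from a table shaped like recount_url.
select_bigwig_urls <- function(url_table, ids, project = "TCGA") {
    # Keep per-sample bigwig rows for the project; skip the mean_<project>.bw file
    is_bw <- grepl("\\.bw$", url_table$file_name) &
        url_table$project == project &
        !grepl("^mean_", url_table$file_name)
    bw <- url_table[is_bw, , drop = FALSE]

    # File names look like <UUID>.bw, so drop the extension before matching
    file_ids <- sub("\\.bw$", "", bw$file_name)
    hits <- match(toupper(ids), toupper(file_ids))
    if (anyNA(hits)) {
        warning("No bigwig found for: ",
            paste(ids[is.na(hits)], collapse = ", "))
    }

    found <- !is.na(hits)
    urls <- as.character(bw$url[hits[found]])
    names(urls) <- ids[found]
    urls
}

# Toy table with the columns used here; values copied from the output above
toy <- data.frame(
    file_name = c(
        "mean_TCGA.bw",
        "3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw",
        "B1E54366-42B9-463C-8615-B34D52BD14DC.bw"
    ),
    project = "TCGA",
    url = c(
        "http://duffel.rail.bio/recount/TCGA/bw/mean_TCGA.bw",
        "http://duffel.rail.bio/recount/TCGA/bw/3DFF72D2-F292-497E-ACE3-6FAA9C884205.bw",
        "http://duffel.rail.bio/recount/TCGA/bw/B1E54366-42B9-463C-8615-B34D52BD14DC.bw"
    ),
    stringsAsFactors = FALSE
)

wanted <- c("B1E54366-42B9-463C-8615-B34D52BD14DC")
urls <- select_bigwig_urls(toy, wanted)
print(urls)

# With the real table (needs the recount package and network access):
# urls <- select_bigwig_urls(recount::recount_url, wanted)
# for (u in urls) download.file(u, destfile = basename(u), mode = "wb")
```

Identifiers without a matching file only trigger a warning, so one bad ID does not stop the rest from being returned. The commented lines at the end show how the result could be passed to download.file once you run it against the real recount_url table.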


Best, Leonardo