Question: Getting the metadata for a RNA-seq sample from TCGA when you have the uuid
2.8 years ago by
United States
Leonardo Collado Torres650 wrote:

Hi,

I have a bunch of UUIDs from TCGA RNA-seq samples and would like to get the metadata for them. Apparently you can get some basic info by going to https://gdc-portal.nci.nih.gov/search/c?filters=%7B%22op%22:%22and%22,%22content%22:%5B%7B%22op%22:%22in%22,%22content%22:%7B%22field%22:%22cases.project.program.name%22,%22value%22:%5B%22TCGA%22%5D%7D%7D%5D%7D&facetTab=cases and clicking on export, but that has failed for me for the past two days. If you follow the links for one sample, you end up at a page such as https://gdc-portal.nci.nih.gov/cases/0004d251-3f70-4395-b175-c94c2f5b1b81 where you can download the clinical info for that sample.

I've looked around at TCGAbiolinks and couldn't find a way to query GDC when I have a uuid such as FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E. Is there a way to do so with TCGAbiolinks? If not, how do you suggest I should proceed?

Thank you,

Leo

> packageVersion('TCGAbiolinks')
[1] ‘2.2.1’

tcgabiolinks tcgadownload • 2.3k views
modified 2.8 years ago by Tiago Chedraoui Silva • written 2.8 years ago by Leonardo Collado Torres
Answer: Getting the metadata for a RNA-seq sample from TCGA when you have the uuid
2.8 years ago by
Brazil - University of São Paulo/ Los Angeles - Cedars-Sinai Medical Center
Tiago Chedraoui Silva240 wrote:

Hi,

No, we don't have this functionality in the package (we had it with TCGA, but they removed the API that mapped BARCODE/UUID). 

However, I couldn't find the UUID FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E in either the legacy or the harmonized database. In which field did you look?

Best regards,

Tiago

written 2.8 years ago by Tiago Chedraoui Silva

Hi Tiago,

The uuid I have is from https://github.com/nellore/runs/blob/105c86de2ef91846f015f5b8285a7d6e29e0fcfc/tcga/tcga_batch_0.manifest#L236 which was created with https://github.com/nellore/runs/blob/105c86de2ef91846f015f5b8285a7d6e29e0fcfc/tcga/true_manifest.py that uses the output from https://github.com/nellore/runs/blob/105c86de2ef91846f015f5b8285a7d6e29e0fcfc/tcga/tcga_file_list.py.

And hm... it's a shame that the BARCODE/UUID api no longer exists. I'm guessing that you are talking about https://wiki.nci.nih.gov/display/TCGA/TCGA+Barcode+to+UUID+Web+Service+User%27s+Guide, right?

Best,

Leo

written 2.8 years ago by Leonardo Collado Torres

Yes, that is the old API. It is not working anymore.

For FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E, if you have this manifest:

https://github.com/nellore/runs/blob/105c86de2ef91846f015f5b8285a7d6e29e0fcfc/tcga/tcga_batch_0.manifest#L236

you can use it to get to the file (which is the submitter_id) and search for that in GDC:

/Datasets/tcga/TCGA-COAD/28033279-cc74-4775-afdf-2497f6ddb55c/analysis/154aa297-0890-4fde-a8c1-2058a4c65b28/data/UNCID_2212217.4a01323f-408b-4e74-8686-ee6d4d076ee8.110302_UNC6-RDR300211_00066_FC_62J5EAAXX_3.tar.gz 0 FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E
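
As a rough illustration of the step above, here is a minimal base R sketch that pulls the file name and UUID out of a manifest line like this one. The helper name `parse_manifest_line` is hypothetical, and the field layout (path, a numeric flag, UUID, separated by whitespace) is only inferred from the single line shown here, so treat it as an assumption rather than a description of the manifest format.

```r
# Hypothetical helper: split one manifest line into path, file name and UUID.
# Assumes exactly three whitespace-separated fields: path, numeric flag, UUID.
parse_manifest_line <- function(line) {
  fields <- strsplit(trimws(line), "[[:space:]]+")[[1]]
  stopifnot(length(fields) == 3)
  list(
    path      = fields[1],
    file_name = basename(fields[1]),  # the part to search for in GDC
    uuid      = tolower(fields[3])    # lowercased; an assumption about how GDC stores UUIDs
  )
}

line <- paste(
  "/Datasets/tcga/TCGA-COAD/28033279-cc74-4775-afdf-2497f6ddb55c/analysis/154aa297-0890-4fde-a8c1-2058a4c65b28/data/UNCID_2212217.4a01323f-408b-4e74-8686-ee6d4d076ee8.110302_UNC6-RDR300211_00066_FC_62J5EAAXX_3.tar.gz",
  "0",
  "FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E"
)
info <- parse_manifest_line(line)
info$file_name
info$uuid
```

The file name returned in `info$file_name` is the string to paste into the GDC search.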

There is no function to map UUID to BARCODE in TCGAbiolinks, but since they mapped the UUID to the file ID, we could create a table, though I believe that is too much work. Did you send an email to the GDC team (https://gdc.cancer.gov/contact-us)? They might have a solution.

 

written 2.8 years ago by Tiago Chedraoui Silva

I was able to create a function to map to barcode; hope that helps you.

What type of metadata do you want?

written 2.8 years ago by Tiago Chedraoui Silva

Awesome! Thanks!

I'm not super familiar with TCGA, but well, basically we would like to get all the metadata associated with a given RNA-seq sample. That is, information about the person (clinical?) and the RNA-seq sample itself if there is any. Is there other information you think might be useful?

written 2.8 years ago by Leonardo Collado Torres

Actually, you are getting all that is available, but there are some marker papers that have already studied some of the samples. Maybe you can use those.

 

written 2.8 years ago by Tiago Chedraoui Silva

Thanks for the help... Just to update for others who may need this, line 8 of the code has changed to:

baseURL <- ifelse(legacy,"https://api.gdc.cancer.gov/legacy/files/?","https://api.gdc.cancer.gov/files/?")

* EDIT * this code currently does not accurately translate legacy UUIDs to barcodes. I manually checked using the GDC legacy archive. Please use the code explained in Sean Davis' blog (https://seandavi.github.io/2017/12/genomicdatacommons-example-uuid-to-tcga-and-target-barcode-translation/) for accurate translation of legacy IDs to barcodes.
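
For anyone who wants to see what a request against either base URL looks like, here is a minimal base R sketch that only assembles the query URL for one file UUID. The function name `build_gdc_file_query` is hypothetical, and the filter field (`files.file_id`) and the returned fields are my assumptions about the GDC files endpoint, not something taken from TCGAbiolinks. It does not work around the legacy translation problem described in the edit above.

```r
# Hypothetical helper: build a GDC files-endpoint URL filtering on one file UUID.
# The field names below are assumptions; adjust them to what you need.
build_gdc_file_query <- function(uuid, legacy = FALSE,
                                 fields = c("file_id", "file_name",
                                            "cases.submitter_id",
                                            "cases.samples.portions.analytes.aliquots.submitter_id")) {
  # Same choice of base URL as the updated line above
  baseURL <- ifelse(legacy,
                    "https://api.gdc.cancer.gov/legacy/files/?",
                    "https://api.gdc.cancer.gov/files/?")
  # JSON filter: match the file whose ID equals the (lowercased) UUID
  filters <- sprintf(
    '{"op":"in","content":{"field":"files.file_id","value":["%s"]}}',
    tolower(uuid)
  )
  paste0(baseURL,
         "filters=", URLencode(filters, reserved = TRUE),
         "&fields=", paste(fields, collapse = ","),
         "&format=JSON&size=1")
}

url <- build_gdc_file_query("FFA5FFF7-6301-4CD8-8E63-A4D8294D1B0E")
url
```

If the jsonlite package is installed, `jsonlite::fromJSON(url)` should fetch and parse the response. Note that, as mentioned earlier in this thread, this particular UUID could not be found in either database, so the query may well return no hits.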

modified 5 months ago • written 5 months ago by david.peeney