Question: Which one to pick if there are duplicate TCGA samples?
19 months ago by
Biologist70 wrote:

Hi @Sean Davis and @Martin Morgan,

I have downloaded the TCGA BRCA raw sequencing data (FASTQ files) from the GDC Legacy Archive, 1256 cases in total. I have the UUIDs for them, so I used the "GenomicDataCommons" package to convert the UUIDs to TCGA barcodes. After this conversion, however, I see that there are duplicate cases.

UUID samplenames
5516dd59-3d95-4bc6-84e7-5719b1bbcabf TCGA-A7-A26F-01B
a907f2d1-92ad-4a1b-b439-20e5a7347d5b TCGA-A7-A26F-01A
b570a72f-5e6c-4301-923b-9992662409ca TCGA-A7-A26F-01B
ba22d7e6-3e70-4a43-9dc1-59069b39e8c2 TCGA-A7-A26F-01B
eb068925-2dcc-4e18-838f-903ac8d2b661 TCGA-A7-A26F-01A

As you can see, there are three UUIDs for "TCGA-A7-A26F-01B" and two UUIDs for "TCGA-A7-A26F-01A".
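To make the duplication concrete, here is a minimal sketch in Python that tallies the UUIDs per sample barcode from the table above and splits a full barcode into its fields. It is only an illustration, not part of the GenomicDataCommons workflow: the helper names `group_by_barcode` and `parse_barcode` are made up for this post, and the field layout is assumed to be the usual TCGA barcode layout (project, TSS, participant, sample type plus vial, portion plus analyte, plate, center).

```python
from collections import defaultdict

# UUID / sample-barcode pairs copied from the table above.
rows = [
    ("5516dd59-3d95-4bc6-84e7-5719b1bbcabf", "TCGA-A7-A26F-01B"),
    ("a907f2d1-92ad-4a1b-b439-20e5a7347d5b", "TCGA-A7-A26F-01A"),
    ("b570a72f-5e6c-4301-923b-9992662409ca", "TCGA-A7-A26F-01B"),
    ("ba22d7e6-3e70-4a43-9dc1-59069b39e8c2", "TCGA-A7-A26F-01B"),
    ("eb068925-2dcc-4e18-838f-903ac8d2b661", "TCGA-A7-A26F-01A"),
]


def group_by_barcode(pairs):
    """Map each barcode to the list of UUIDs that resolve to it."""
    groups = defaultdict(list)
    for uuid, barcode in pairs:
        groups[barcode].append(uuid)
    return dict(groups)


def parse_barcode(barcode):
    """Split a TCGA barcode into named fields (hypothetical helper).

    Assumes the usual layout; fields missing from a shortened barcode
    are simply left out of the result.
    """
    parts = barcode.split("-")
    fields = dict(zip(["project", "tss", "participant"], parts[:3]))
    if len(parts) > 3:
        fields["sample_type"] = parts[3][:2]  # e.g. "01"
        if len(parts[3]) > 2:
            fields["vial"] = parts[3][2:]     # e.g. "A"
    if len(parts) > 4:
        fields["portion"] = parts[4][:2]      # e.g. "21"
        if len(parts[4]) > 2:
            fields["analyte"] = parts[4][2:]  # e.g. "R"
    if len(parts) > 5:
        fields["plate"] = parts[5]            # e.g. "A169"
    if len(parts) > 6:
        fields["center"] = parts[6]           # e.g. "07"
    return fields


# Keep only barcodes that are shared by more than one UUID.
duplicates = {
    barcode: uuids
    for barcode, uuids in group_by_barcode(rows).items()
    if len(uuids) > 1
}

for barcode, uuids in sorted(duplicates.items()):
    print(barcode, len(uuids))

print(parse_barcode("TCGA-A7-A26F-01A-21R-A169-07"))
```

With full aliquot barcodes, the `analyte` and `plate` fields from something like `parse_barcode` would be what I could filter on.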


1) Is there a way to get the full TCGA barcode, including the aliquot (like "TCGA-A7-A26F-01A-21R-A169-07"), from the UUIDs, so that I can choose the sample based on analyte or plate number?

2) I downloaded "gdac.broadinstitute.org_BRCA.Merge_rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes_normalized__data.Level_3.2016012800.0.0.tar.gz" and see that it has 1212 samples. Among those, there is only one sample for the case "TCGA-A7-A26F", which is "TCGA-A7-A26F-01A-21R-A169-07".

3) How can I find out whether a sample is FFPE or not?

4) What should I do when two UUIDs have the same TCGA barcode with the same aliquot, like below:


Thank you

rnaseq R tcga tcgabiolinks gdc

@Sean Davis and @Martin Morgan, could you please help me with this?

Thank you 

written 19 months ago by Biologist70