Closed: Which one to pick if there are duplicate TCGA samples?
Biologist ▴ 110
@biologist-9801
Last seen 4.1 years ago

Hi @Sean Davis and @Martin Morgan,

I have downloaded the TCGA BRCA raw sequencing data (FASTQ files) from the GDC legacy archive, 1256 cases in total. The files are identified by UUID, so I used the "GenomicDataCommons" package to convert the UUIDs to TCGA barcodes. After the conversion I see that some cases are duplicated.

UUID samplenames
5516dd59-3d95-4bc6-84e7-5719b1bbcabf TCGA-A7-A26F-01B
a907f2d1-92ad-4a1b-b439-20e5a7347d5b TCGA-A7-A26F-01A
b570a72f-5e6c-4301-923b-9992662409ca TCGA-A7-A26F-01B
ba22d7e6-3e70-4a43-9dc1-59069b39e8c2 TCGA-A7-A26F-01B
eb068925-2dcc-4e18-838f-903ac8d2b661 TCGA-A7-A26F-01A

As you can see, "TCGA-A7-A26F-01B" maps to three UUIDs and "TCGA-A7-A26F-01A" maps to two.
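To make the duplication concrete, here is a minimal Python sketch that groups the UUIDs listed above by barcode and reports every barcode with more than one UUID. The `pairs` list and the `group_by_barcode` helper are illustrative names, not part of any GDC tool; in practice the pairs would come from whatever table the UUID-to-barcode conversion produced.

```python
from collections import defaultdict

# UUID -> sample barcode pairs, copied from the listing above
pairs = [
    ("5516dd59-3d95-4bc6-84e7-5719b1bbcabf", "TCGA-A7-A26F-01B"),
    ("a907f2d1-92ad-4a1b-b439-20e5a7347d5b", "TCGA-A7-A26F-01A"),
    ("b570a72f-5e6c-4301-923b-9992662409ca", "TCGA-A7-A26F-01B"),
    ("ba22d7e6-3e70-4a43-9dc1-59069b39e8c2", "TCGA-A7-A26F-01B"),
    ("eb068925-2dcc-4e18-838f-903ac8d2b661", "TCGA-A7-A26F-01A"),
]


def group_by_barcode(pairs):
    """Collect all UUIDs that share the same barcode (hypothetical helper)."""
    groups = defaultdict(list)
    for uuid, barcode in pairs:
        groups[barcode].append(uuid)
    return dict(groups)


groups = group_by_barcode(pairs)

# Keep only barcodes that occur more than once
duplicates = {bc: uuids for bc, uuids in groups.items() if len(uuids) > 1}

for barcode, uuids in sorted(duplicates.items()):
    print(barcode, len(uuids))
```

The same grouping works unchanged on full aliquot barcodes, which is where the choice between duplicates would actually be made.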

Questions:

1) Is there a way to get the full TCGA barcode, including the aliquot (like "TCGA-A7-A26F-01A-21R-A169-07"), from the UUIDs? Then I could choose the sample based on the analyte or plate number.

2) From Firebrowse.org I downloaded "gdac.broadinstitute.org_BRCA.Merge_rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes_normalized__data.Level_3.2016012800.0.0.tar.gz", which has 1212 samples. Among them there is only one sample for case "TCGA-A7-A26F", namely "TCGA-A7-A26F-01A-21R-A169-07".

3) How can I find out whether a sample is FFPE or not?

4) What should I do when two UUIDs have the same TCGA barcode, down to the same aliquot, as below?

eb068925-2dcc-4e18-838f-903ac8d2b661 TCGA-A7-A26F-01A-21R-A169-07
a907f2d1-92ad-4a1b-b439-20e5a7347d5b TCGA-A7-A26F-01A-21R-A169-07
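For choosing by analyte or plate, the aliquot barcode first has to be split into its fields. A minimal Python sketch, assuming the standard seven-part TCGA aliquot layout (project, tissue source site, participant, sample type plus vial, portion plus analyte, plate, center); the `Barcode` type and `parse_aliquot_barcode` function are hypothetical names used only for illustration:

```python
from typing import NamedTuple


class Barcode(NamedTuple):
    project: str      # e.g. "TCGA"
    tss: str          # tissue source site, e.g. "A7"
    participant: str  # e.g. "A26F"
    sample_type: str  # two digits, e.g. "01"
    vial: str         # one letter, e.g. "A"
    portion: str      # two digits, e.g. "21"
    analyte: str      # one letter, e.g. "R"
    plate: str        # e.g. "A169"
    center: str       # e.g. "07"


def parse_aliquot_barcode(barcode: str) -> Barcode:
    """Split a full aliquot-level barcode into named fields."""
    parts = barcode.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 dash-separated parts, got {len(parts)}: {barcode!r}")
    project, tss, participant, sample, portion, plate, center = parts
    if len(sample) != 3 or len(portion) != 3:
        raise ValueError(f"unexpected sample/portion field in {barcode!r}")
    return Barcode(
        project, tss, participant,
        sample[:2], sample[2],    # sample type code and vial letter
        portion[:2], portion[2],  # portion number and analyte letter
        plate, center,
    )


bc = parse_aliquot_barcode("TCGA-A7-A26F-01A-21R-A169-07")
print(bc.analyte, bc.plate)
```

Note that both UUIDs in question 4 parse to identical fields, so the barcode alone cannot separate them; some other attribute of the files would be needed for that.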

Thank you

rnaseq tcga gdc tcgabiolinks r • 495 views