Hi @Sean Davis and @Martin Morgan,
I have downloaded the TCGA BRCA raw sequencing data (FASTQ files) from the GDC Legacy Archive, 1256 cases in total. The files are named by UUID, so I used the "GenomicDataCommons" package to convert the UUIDs to TCGA barcodes. After the conversion, however, I see duplicate cases:
UUID | samplenames
5516dd59-3d95-4bc6-84e7-5719b1bbcabf | TCGA-A7-A26F-01B
a907f2d1-92ad-4a1b-b439-20e5a7347d5b | TCGA-A7-A26F-01A
b570a72f-5e6c-4301-923b-9992662409ca | TCGA-A7-A26F-01B
ba22d7e6-3e70-4a43-9dc1-59069b39e8c2 | TCGA-A7-A26F-01B
eb068925-2dcc-4e18-838f-903ac8d2b661 | TCGA-A7-A26F-01A
As you can see, "TCGA-A7-A26F-01B" maps to three UUIDs and "TCGA-A7-A26F-01A" maps to two UUIDs.
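For reference, this is roughly how the duplicates can be counted once the UUID-to-barcode mapping is in hand. It is only a minimal sketch: the dictionary is typed in by hand from the rows above, and `group_uuids_by_barcode` is a hypothetical helper name, not a function from any package.

```python
from collections import defaultdict

# UUID -> sample barcode pairs, copied by hand from the rows listed above.
uuid_to_barcode = {
    "5516dd59-3d95-4bc6-84e7-5719b1bbcabf": "TCGA-A7-A26F-01B",
    "a907f2d1-92ad-4a1b-b439-20e5a7347d5b": "TCGA-A7-A26F-01A",
    "b570a72f-5e6c-4301-923b-9992662409ca": "TCGA-A7-A26F-01B",
    "ba22d7e6-3e70-4a43-9dc1-59069b39e8c2": "TCGA-A7-A26F-01B",
    "eb068925-2dcc-4e18-838f-903ac8d2b661": "TCGA-A7-A26F-01A",
}


def group_uuids_by_barcode(mapping):
    """Return {barcode: [uuid, ...]} for a {uuid: barcode} mapping."""
    groups = defaultdict(list)
    for uuid, barcode in mapping.items():
        groups[barcode].append(uuid)
    return dict(groups)


groups = group_uuids_by_barcode(uuid_to_barcode)

# Keep only barcodes that are shared by more than one UUID.
duplicates = {b: u for b, u in groups.items() if len(u) > 1}

for barcode, uuids in sorted(duplicates.items()):
    print(barcode, len(uuids))
```

With the five rows above, both sample barcodes end up in `duplicates`.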
Questions:
1) Is there a way to get the full TCGA barcode, including the aliquot (for example "TCGA-A7-A26F-01A-21R-A169-07"), from the UUIDs? That way I could choose the sample based on the analyte or plate number.
2) From Firebrowse.org I downloaded "gdac.broadinstitute.org_BRCA.Merge_rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes_normalized__data.Level_3.2016012800.0.0.tar.gz", which contains 1212 samples. In that file there is only one sample for case "TCGA-A7-A26F", namely "TCGA-A7-A26F-01A-21R-A169-07".
3) How can I find out whether a sample is FFPE or not?
4) What should I do when two UUIDs have the same TCGA barcode with the same aliquot, as below?
eb068925-2dcc-4e18-838f-903ac8d2b661 | TCGA-A7-A26F-01A-21R-A169-07
a907f2d1-92ad-4a1b-b439-20e5a7347d5b | TCGA-A7-A26F-01A-21R-A169-07
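To make question 1 concrete, here is a minimal sketch of how I would split a full aliquot barcode into its fields in order to filter on analyte or plate. It assumes the usual TCGA barcode layout (project, tissue source site, participant, sample type plus vial, portion plus analyte, plate, center). `parse_aliquot_barcode` and the regular expression are my own hypothetical helpers, not part of GenomicDataCommons.

```python
import re

# Assumed layout: TCGA-TSS-Participant-SampleVial-PortionAnalyte-Plate-Center
BARCODE_RE = re.compile(
    r"^(?P<project>TCGA)"
    r"-(?P<tss>[A-Z0-9]{2})"
    r"-(?P<participant>[A-Z0-9]{4})"
    r"-(?P<sample>\d{2})(?P<vial>[A-Z])"
    r"-(?P<portion>\d{2})(?P<analyte>[A-Z])"
    r"-(?P<plate>[A-Z0-9]{4})"
    r"-(?P<center>\d{2})$"
)


def parse_aliquot_barcode(barcode):
    """Split a full aliquot-level TCGA barcode into a dict of named fields.

    Raises ValueError for shorter barcodes (e.g. sample-level ones such as
    "TCGA-A7-A26F-01A"), because they carry no analyte or plate information.
    """
    match = BARCODE_RE.match(barcode)
    if match is None:
        raise ValueError(f"not a full aliquot barcode: {barcode!r}")
    return match.groupdict()


fields = parse_aliquot_barcode("TCGA-A7-A26F-01A-21R-A169-07")
print(fields["analyte"], fields["plate"])
```

With the full barcode available for every UUID, picking one file per sample would then come down to filtering on `fields["analyte"]` or `fields["plate"]`.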
Thank you