TCGA - Different Sample Number
iedenkim • 0
@iedenkim-12391
Last seen 7.1 years ago

I downloaded the count dataset for esophagus tissue from ReCount and found that it contains 198 samples, which differs from what I got from the TCGA website, where I could only download 185 samples. Can you let me know where this difference comes from?

By the way, ReCount DB is super convenient. 

recount • 2.1k views

Hi,

Are you using the recount Bioconductor package? If so, please follow the posting guidelines http://www.bioconductor.org/help/support/posting-guide/. Specifically, post some reproducible code and session information.

If you are using the recount website, can you post the link to the file you downloaded?

Finally, can you describe, in enough detail to reproduce it, how you got 185 samples from the TCGA website?

Thanks,

Leonardo

 


Hi Leonardo,

Thank you for your reply. I downloaded the file from the recount website; the link to the file I downloaded is HERE

Also, I checked the count of 185 samples on both Firebrowse and the TCGA GDC Data Portal, and downloaded the samples from there.

Thanks,

Jungsoo

sellis18 ▴ 20
@sellis18-12319
Last seen 7.2 years ago

Hi Jungsoo,

Leo pointed me in the direction of your question. I'm not sure how Firebrowse filtered its data to reach 185 samples (I'm less familiar with that resource and didn't look into it); however, I think I may have found the answer to your question. It looks as though 185 of the 198 samples are tumor samples (sample type codes 01 and 06), while the other 13 are normal samples (code 11); see the code below. TCGA barcode explanations can be further explored here.

 


## Download metadata for the recount TCGA data
md <- recount::all_metadata("TCGA")

## Keep only the esophageal samples
md_sub <- subset(md, md$xml_primary_pathology_tumor_tissue_site == "Esophagus")

## Tabulate the sample type codes.
## Tumor types range from 01 to 09, normal types from 10 to 19,
## and control samples from 20 to 29.
table(md_sub$gdc_cases.samples.sample_type_id)
#>  01  06  11
#> 184   1  13
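
To make the tumor/normal split explicit, the tally above can be collapsed into the three broad categories. This is only a minimal sketch in base R: `classify_sample_type()` is a hypothetical helper (not part of recount), and the counts are copied by hand from the table output above so the snippet runs without downloading the metadata.

```r
## Hypothetical helper (not part of recount): map a TCGA sample type code
## to its broad category, using the ranges quoted in the comment above
## (01-09 tumor, 10-19 normal, 20-29 control).
classify_sample_type <- function(type_id) {
  code <- as.integer(type_id)
  cut(code,
      breaks = c(0, 9, 19, 29),
      labels = c("tumor", "normal", "control"))
}

## Counts copied by hand from the table() output above
type_counts <- c("01" = 184, "06" = 1, "11" = 13)

## Sum the counts within each broad category
category <- classify_sample_type(names(type_counts))
totals <- sapply(split(type_counts, category), sum)
print(totals)
```

With these counts, the tumor total is 185 (matching the number seen on the TCGA side), the normal total is 13, and together they account for all 198 samples in recount. On the real metadata, the same helper could presumably be applied to `md_sub$gdc_cases.samples.sample_type_id` to build a filter for tumor-only samples.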

Let me (er...us) know if anything is unclear or you have further questions!

Shannon

 


Hi Shannon,

Your explanation is perfect and clears up my question. Thank you.

Jungsoo
