Question: TCGA - Different Sample Number
8 months ago, iedenkim wrote:

I downloaded the count dataset for esophagus tissues from ReCount and found that it contains 198 samples, which differs from what I got from the TCGA website, where I could only download 185 samples. Can you let me know where this difference comes from?

By the way, ReCount DB is super convenient. 


Hi,

Are you using the recount Bioconductor package? If so, please follow the posting guidelines http://www.bioconductor.org/help/support/posting-guide/. Specifically, post some reproducible code and session information.

If you are using the recount website, can you post the link of the file you downloaded?

Finally, can you provide reproducible information for how you got 185 samples from the TCGA website?

Thanks,

Leonardo

 

written 8 months ago by Leonardo Collado Torres

Hi Leonardo,

Thank you for your reply. I downloaded the file from the recount website, and the link I used is HERE

Also, I checked the count of 185 samples on, and downloaded them from, both Firebrowse and the GDC Data Portal for TCGA.

Thanks,

Jungsoo

written 8 months ago by iedenkim

8 months ago, sellis18 wrote:

Hi Jungsoo,

Leo pointed me in the direction of your question. I'm not sure where or how Firebrowse filtered their data to reach 185 samples (I'm less familiar with that resource and didn't look into it); however, I think I may have found the answer to your question. It looks as though the 185 samples are tumor samples (sample type codes 01 and 06), while the other 13 are normal samples (code 11); see the code below. TCGA barcode explanations can be further explored here.

 


## Download the metadata for the recount TCGA data
md <- recount::all_metadata("TCGA")

## Keep only the esophageal samples
md_sub <- subset(md, md$xml_primary_pathology_tumor_tissue_site == "Esophagus")

## Take a look at the sample type information.
## Tumor types range from 01 - 09, normal types from 10 - 19 and control samples from 20 - 29
table(md_sub$gdc_cases.samples.sample_type_id)
##  01  06  11
## 184   1  13
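
To make the arithmetic explicit, here is a minimal sketch that groups the sample type codes from the table above into tumor, normal and control ranges. It rebuilds the codes from the tabulated counts so it runs without downloading anything; `classify_sample_type()` is a hypothetical helper written for this illustration and is not part of recount, and the barcode at the end is only an illustrative example of the TCGA barcode layout.

```r
## Rebuild the sample type IDs from the counts tabulated above
## (184 x "01", 1 x "06", 13 x "11"); no download needed for this sketch
sample_type_id <- rep(c("01", "06", "11"), times = c(184, 1, 13))

## Hypothetical helper (not a recount function): map a two-digit code to
## its group, using the ranges 01-09 tumor, 10-19 normal, 20-29 control
classify_sample_type <- function(id) {
    code <- as.integer(id)
    cut(code, breaks = c(0, 9, 19, 29),
        labels = c("tumor", "normal", "control"))
}

sample_group <- classify_sample_type(sample_type_id)

## Counts per group: the tumor and normal counts should add up to 198
table(sample_group)

## The same two-digit code starts the fourth field of a TCGA barcode
## (illustrative barcode, assumed here only to show the layout)
barcode <- "TCGA-02-0001-01C-01D-0182-01"
barcode_code <- substr(strsplit(barcode, "-", fixed = TRUE)[[1]][4], 1, 2)
classify_sample_type(barcode_code)
```

With the real metadata, the same helper could be applied to `md_sub$gdc_cases.samples.sample_type_id` directly; keeping only the "tumor" group would then presumably reproduce the 185 samples seen on the TCGA side.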

Let me (er...us) know if anything is unclear or you have further questions!

Shannon

 


Hi Shannon,

Your explanation is perfect and my question is answered. Thank you.

Jungsoo

written 8 months ago by iedenkim