Question: TCGAbiolinks outputs (5% of the information is lost)
16 months ago by dydrkq777

Hi, I'm Yonggab.

I downloaded the TCGA ovarian cancer data using the TCGAbiolinks package.

(Transcriptome Profiling / Gene Expression Quantification / HTSeq - Counts)

But I found that 5% of the data was filtered out (lost).

Can you tell me why it was filtered?

And is there a way to see this in R? (I have the whole gene set.)

 

 


Hello,

To create a SummarizedExperiment object in TCGAbiolinks, we have to map genes, probes, etc. to genomic coordinates. We decided to map them to the most recent patched Ensembl version. For humans, those versions are GRCh38.p12 (hg38) and GRCh37.p13 (hg19). In the version we use, some of the genes have been retired (https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000234536, https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000230939, https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000069712), so they cannot be mapped and are dropped.

 

You can get an unmodified version if you set the argument summarizedExperiment to FALSE in GDCprepare (there is an example at http://rpubs.com/tiagochst/five_5_percent_doubt with a table of the genes we could not map in the most recent version).
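
To see in R which genes were dropped, one option is to compare the gene IDs of the unmodified table with those of the SummarizedExperiment. The following is only a rough sketch, not the code from the RPubs page. The helper names `unmapped_genes` and `compare_tcga_ov` are made up for illustration. It also assumes that the first column of the unmodified table holds the Ensembl gene IDs, that the row names of the SummarizedExperiment are Ensembl gene IDs, and that the workflow is still called "HTSeq - Counts" (newer GDC releases may use a different workflow name).

```r
# Hypothetical helper: return the gene IDs present in the unmodified table
# but absent from the SummarizedExperiment.
unmapped_genes <- function(raw_ids, mapped_ids) {
  # Drop Ensembl version suffixes such as ".13" so both sets are comparable.
  strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)
  setdiff(strip_version(raw_ids), strip_version(mapped_ids))
}

# Toy IDs, only to show how the helper behaves.
raw_ids    <- c("ENSG00000000003.13", "ENSG00000234536.1", "ENSG00000069712.9")
mapped_ids <- c("ENSG00000000003")
missing    <- unmapped_genes(raw_ids, mapped_ids)
print(missing)

# Hypothetical wrapper for the real data. It is defined but not called here,
# because it downloads a large amount of data from the GDC.
compare_tcga_ov <- function() {
  query <- TCGAbiolinks::GDCquery(
    project       = "TCGA-OV",
    data.category = "Transcriptome Profiling",
    data.type     = "Gene Expression Quantification",
    workflow.type = "HTSeq - Counts"  # assumption: name used at the time of this thread
  )
  TCGAbiolinks::GDCdownload(query)
  # Default: SummarizedExperiment, containing only genes that could be mapped.
  se  <- TCGAbiolinks::GDCprepare(query)
  # Unmodified table with every gene in the original files.
  raw <- TCGAbiolinks::GDCprepare(query, summarizedExperiment = FALSE)
  # Assumption: the first column of `raw` holds the Ensembl gene IDs.
  unmapped_genes(raw[[1]], rownames(se))
}
```

Calling `compare_tcga_ov()` should then give the IDs that are in the original files but not in the SummarizedExperiment, which you can look up in the Ensembl ID history pages linked above.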

 

 

written 16 months ago by Tiago Chedraoui Silva