TCGAbiolinks outputs (5% of the information is lost)
dydrkq777 • 0

 

Hi, I'm Yonggab.

I downloaded the TCGA ovarian cancer data using the TCGAbiolinks package.

(Transcriptome Profiling / Gene Expression Quantification / HTSeq - Counts)

However, I found that 5% of the data had been filtered out (lost).

Can you tell me why it was filtered?

And is there a way to see this in R? (I have the whole gene set.)

 

 


Hello,

To create a SummarizedExperiment object in TCGAbiolinks we have to map genes, probes, etc. to genomic coordinates. We decided to map them to the most recent Ensembl patch release. For human, those releases are GRCh38.p12 (hg38) and GRCh37.p13 (hg19). Some of the genes in the data have since been retired from Ensembl, so they cannot be mapped in those releases and are dropped (for example https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000234536, https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000230939, https://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000069712).

 

You can get an unmodified version of the data if you set the argument summarizedExperiment to FALSE in GDCprepare. There is an example at http://rpubs.com/tiagochst/five_5_percent_doubt, which includes a table of the genes we could not map to the most recent release.
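To see the dropped genes in R, one option is to prepare the data both ways and compare the gene IDs. The following is only a minimal sketch, not the code from the linked example. The helpers `strip_version`, `find_dropped_genes` and `fetch_dropped_genes` are hypothetical names introduced here, the ID versions in the toy data are made up, and the sketch assumes that the first column of the unmodified table holds the Ensembl gene IDs and that the "HTSeq - Counts" workflow is still served by GDC for your TCGAbiolinks version.

```r
# Hypothetical helper: remove the version suffix, e.g. "ENSG00000000003.13" -> "ENSG00000000003"
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Hypothetical helper: IDs present in the unmodified table but missing after mapping
find_dropped_genes <- function(raw_ids, mapped_ids) {
  # Keep only Ensembl gene IDs; HTSeq also writes summary rows such as "__no_feature"
  raw_ids <- raw_ids[grepl("^ENSG", raw_ids)]
  setdiff(strip_version(raw_ids), strip_version(mapped_ids))
}

# Not called here: needs TCGAbiolinks installed and network access
fetch_dropped_genes <- function(project = "TCGA-OV") {
  query <- TCGAbiolinks::GDCquery(
    project       = project,
    data.category = "Transcriptome Profiling",
    data.type     = "Gene Expression Quantification",
    workflow.type = "HTSeq - Counts"  # assumption: workflow available in your version
  )
  TCGAbiolinks::GDCdownload(query)
  se  <- TCGAbiolinks::GDCprepare(query)                                # mapped genes only
  raw <- TCGAbiolinks::GDCprepare(query, summarizedExperiment = FALSE)  # unmodified table
  # Assumption: the first column of `raw` contains the Ensembl gene IDs
  find_dropped_genes(as.character(raw[[1]]), rownames(se))
}

# Toy demonstration with made-up version suffixes
raw_ids    <- c("ENSG00000000003.13", "ENSG00000234536.1",
                "ENSG00000230939.1", "__no_feature")
mapped_ids <- c("ENSG00000000003")
dropped <- find_dropped_genes(raw_ids, mapped_ids)
print(dropped)
```

With real data, `fetch_dropped_genes("TCGA-OV")` would return the IDs that did not make it into the SummarizedExperiment, and each one can be checked in the Ensembl ID history pages linked above.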

 

 
