Hi,
I'm currently using TCGA data in my project. I'm trying to set up a pipeline to analyse these data, but I have a statistical question that I'm not sure about.
Before downloading the TCGA data, I checked on the TCGA website that, for the available data (hg19 genome; I'm using level 3 data), the gene counts are estimated by RSEM:
- TCGA (1) : https://wiki.nci.nih.gov/display/tcga/rnaseq+version+2
- TCGA (2): https://wiki.nci.nih.gov/display/TCGA/Data+Levels+and+Data+Types
- RSEM: https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-12-323?site=bmcbioinformatics.biomedcentral.com
Since these data have already been processed with RSEM, I would like to know whether I can feed this read count table into DESeq2, and whether doing so could introduce any statistical inconsistency in the DESeq2 analysis.
* I already checked these posts:
But I'm still confused.
Thank you all.
---------------------------------------------------------------------------------------------------------------------------
To download the data, I'm using TCGAbiolinks as follows:
# Install the packages from Bioconductor if they are missing, then load them
if (!require("TCGAbiolinks")) {
    source("https://bioconductor.org/biocLite.R")
    biocLite("TCGAbiolinks")
    library("TCGAbiolinks")
}
if (!require("SummarizedExperiment")) {
    source("https://bioconductor.org/biocLite.R")
    biocLite("SummarizedExperiment")
    library("SummarizedExperiment")
}
i = "TCGA-LUSC"
# Downloading data (legacy archive, hg19)
query.exp.proj.gene = GDCquery(project = i,
                               legacy = TRUE,
                               data.category = "Gene expression",
                               data.type = "Gene expression quantification",
                               platform = "Illumina HiSeq",
                               file.type = "results")
GDCdownload(query.exp.proj.gene, directory = '~/GDCdata/')
# Make sure the output folder exists before switching to it
dir.create('~/GDCdata/RDAFiles', recursive = TRUE, showWarnings = FALSE)
setwd('~/GDCdata/RDAFiles')
exp.proj.mrna = GDCprepare(query = query.exp.proj.gene, save = TRUE, save.filename = paste0(i, "-mRNA.rda"), directory = '~/GDCdata')
# Loading RDA file (the saved object is expected to be named `data`)
load(file = paste0('~/GDCdata/RDAFiles/', i, '-mRNA.rda'))
# Count table
exp.matrix = assay(data)
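
To make the question concrete, here is a minimal sketch of the step I would run next. It assumes that the RSEM values are expected counts, which are usually not whole numbers, and that rounding them is acceptable, which is exactly the part I am unsure about. The toy matrix, its gene and sample names, and the object names `exp.matrix.toy`, `is.whole` and `count.matrix` are made up for illustration and only stand in for `assay(data)`.

```r
# Toy stand-in for assay(data); values are invented, not real TCGA data.
# RSEM expected counts are typically non-integer.
exp.matrix.toy <- matrix(
  c(10.52, 0.00, 250.37,
    12.00, 3.49, 198.80),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("sample1", "sample2"))
)

# Check whether every value is already a whole number
is.whole <- all(exp.matrix.toy == round(exp.matrix.toy))

# Assumption: rounding to the nearest integer is an acceptable approximation.
# DESeqDataSetFromMatrix() rejects non-integer values, so the matrix is
# rounded and stored as integers before it would be handed to DESeq2.
count.matrix <- round(exp.matrix.toy)
storage.mode(count.matrix) <- "integer"
```

With the real data, `count.matrix` would then be passed as `countData` to `DESeqDataSetFromMatrix()`, together with the sample table as `colData` and a `design` formula. Whether rounding RSEM estimates this way is statistically sound for DESeq2 is what I would like to confirm.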
