Here is a basic example of downloading RNA-seq data for GBM from TCGA using TCGAbiolinks:
# mRNA pipeline: https://gdc-docs.nci.nih.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/
library(TCGAbiolinks)

# Build the query: FPKM-UQ gene expression for two GBM samples
query.exp.hg38 <- GDCquery(project = "TCGA-GBM",
                           data.category = "Transcriptome Profiling",
                           data.type = "Gene Expression Quantification",
                           workflow.type = "HTSeq - FPKM-UQ",
                           barcode = c("TCGA-14-0736-02A-01R-2005-01", "TCGA-06-0211-02A-02R-2005-01"))

# Download the files, then assemble them into a SummarizedExperiment
GDCdownload(query.exp.hg38)
expdat <- GDCprepare(query = query.exp.hg38,
                     save = TRUE,
                     save.filename = "exp.rda")
My question is: is there a way to choose which genes are included in the downloaded SummarizedExperiment object? For example, what if I only want to look at the expression of ribosomal proteins and don't want to download the thousands of other genes? I am thinking of using TCGAbiolinks in a Shiny app where the user selects the genes they want, so not downloading everything would be a lot faster.
If this is not possible with TCGAbiolinks, is there another package that has the functionality I'm looking for?
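For reference, the fallback I would like to avoid is subsetting the object after the full download. Below is a minimal sketch of that on a toy SummarizedExperiment standing in for expdat. The toy data, the assay name fpkm_uq, and the rowData column gene_name are assumptions for illustration and may not match what GDCprepare actually returns.

```r
# Toy stand-in for the object returned by GDCprepare(); all values are made up.
library(SummarizedExperiment)

genes <- c("RPL3", "RPS6", "TP53", "EGFR", "RPLP0")
expr <- matrix(seq_len(length(genes) * 2), nrow = length(genes),
               dimnames = list(genes, c("sample1", "sample2")))

# `gene_name` is an assumed column name; check colnames(rowData(expdat))
# on the real object to see which annotation columns are available.
toy <- SummarizedExperiment(assays = list(fpkm_uq = expr),
                            rowData = DataFrame(gene_name = genes))

# Genes the user picked (hypothetical selection, e.g. from a Shiny input)
selected <- c("RPL3", "RPS6")

# Keep only the rows whose gene symbol is in the selection
ribo <- toy[rowData(toy)$gene_name %in% selected, ]
dim(ribo)
```

As far as I can tell, this only shrinks the object in memory: the files for every gene still have to be downloaded first, which is the slow step I am hoping to skip.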