I have a fairly simple question that I know has been addressed many times: I want to import RSEM data into DESeq2 for modeling and DE. For reproducibility, the workflow is:
Unfortunately, it is costing me an inordinate amount of time, and I cannot find a perfectly analogous example either in any vignette (for DESeq2, tximport, tximeta, tximportData, etc) or in other posts.
libsNeeded<-c( "cBioPortalData", "DESeq2", "Tximport", "SummarizedExperiment") BiocManager::install(libsNeeded, update = TRUE, ask = FALSE) lapply(libsNeeded, library, character.only = TRUE)
Get a MultiAssayExperiment using cBioPortal
cbio <- cBioPortal() getStudies(api = cbio) studies <- getStudies(cbio, buildReport = TRUE) CptacMae<-cBioDataPack("brain_cptac_2020", ask=FALSE) CptacRnaSeqSE<-experiments(BrainCptac2020Mae)$mrna_seq_v2_rsem
Look at resulting MAE object
CptacMae # Returns, a MultiAssayExperiment with 6 listed experiments CptacRnaSeqSE # returns a Summarized Experiment, and: assays(CptacMae)$mrna_seq_v2_rsem # returns a DataFrame of 18,000 rows and 188 cols. row.names(CptacRnaSeqSE) # returns HUGO gene symbols - so this is **not** transcript level info all(row.names(colData(CptacRnaSeqSE) ) == colnames(assays(CptacRnaSeqSE)[])) # TRUE
Determine prior processing of Rna-Seq data by reading the metadata from cBioPortal:
profile_name: RNA expression
profile_description: Log2 of FPKM + 1 expression values. STAR v2.6.1d aligned + RSEM v1.3.1 quantification.
Of course, I cannot load these data directly with
DESeqDataSetFromMatrix() because the values are decimal and it wouldn't be appropriate.
Using RSEM with tximport and DESeq2 provides a workflow using:
library(tximeta) se <- tximeta(ColData, type="rsem", txIn=FALSE, txOut=FALSE, skipMeta=TRUE) assays(se)$length[ assays(se)$length == 0] <- NA # set these as missing
but trying this on my data returns:
tximeta(CptacRnaSeqSE, type = "rsem", txIn = FALSE, txOut = FALSE, skipMeta = TRUE)
all(c("files", "names") %in% names(coldata)) is not TRUE
I do not have files, I am working from a SE directly. This would seem like it should be the easiest scenario on paper - it's already in the native language in an appropriate data object - but its proving to be more difficult than just downloading flat files would have been ...
What am I missing here? Thank you!