Expected counts from RSEM in DESeq2
@chudarchudar-9587

Hi All,

I am new to DESeq2 and am following the Trinity pipeline for differential expression analysis. In that pipeline, RSEM quantifies transcript abundance and reports expected counts. These expected counts are rounded and then passed to DESeq2 for further analysis.

I would like to know whether the expected counts generated by RSEM can be fed into DESeq2 in place of raw counts to identify differentially expressed genes?

Regards

Chudar
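
For readers unfamiliar with the rounding step mentioned in the question, here is a minimal base-R sketch. The matrix `expected` and its values are made up for illustration; real values would come from the `expected_count` column of the RSEM output.

```r
# Toy matrix standing in for RSEM expected counts (values are invented).
expected <- matrix(
  c(10.37, 0.00, 250.82,
     8.91, 1.49, 301.15),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

# Round to the nearest whole number and store the result as integers,
# the form a plain count matrix is expected to have.
rounded <- round(expected)
storage.mode(rounded) <- "integer"

print(rounded)
```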

@mikelove

Yes, RSEM expected counts can be used with DESeq2.

The recommended pipeline would be to use tximport(), then DESeqDataSetFromTximport().

There is an example of importing RSEM gene-level estimated counts in the tximport vignette.

In addition to reading in the counts table, the tximport pipeline incorporates the average transcript length per gene as a normalization factor for gene-level DE analysis. See the citation listed on the tximport landing page for more details:

https://bioconductor.org/packages/release/bioc/html/tximport.html
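
To give a rough feel for what using average transcript length as a normalization factor means, here is a base-R sketch of the general idea with invented numbers. It is not DESeq2's actual implementation and it ignores library size; `counts`, `avg_len`, and `len_factor` are names chosen for this illustration only.

```r
# Invented gene-level counts and per-sample average transcript lengths.
counts <- matrix(c(100, 200, 50, 400), nrow = 2,
                 dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
avg_len <- matrix(c(1000, 2000, 500, 2000), nrow = 2,
                  dimnames = dimnames(counts))

# Scale each gene's lengths by their geometric mean across samples, so the
# factors describe relative length changes between samples for that gene.
len_factor <- avg_len / exp(rowMeans(log(avg_len)))

# Dividing by the factor removes the part of a count difference that is
# explained by a change in average transcript length.
adjusted <- counts / len_factor
print(round(adjusted, 2))
```

In this toy case geneA's count halves between samples, but so does its average transcript length, so the adjusted values agree. geneB's length is unchanged, so its twofold difference remains.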


Hi! I need to import RSEM data into DESeq2 for downstream analyses. My RSEM output has the following columns:

gene_id transcript_id(s) length effective_length expected_count TPM FPKM

I tried to follow the tximport tutorial from

https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html

but I got the following error:

Error in DESeqDataSetFromTximport(txi, sampleTable, ~condition) :
all(lengths > 0) is not TRUE
Calls: DESeqDataSetFromTximport -> stopifnot
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
45 duplicate rownames were renamed by adding numbers


My code is as follows:

library(tximportData)
library(tximport)
library(DESeq2)

# files is a vector with the list of 10 RSEM output files
names(files) <- paste0("sample", 1:10)

#import files
txi <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
names(txi)

# cond is a vector with conditions to be used for differential analysis
sampleTable <- data.frame(condition = factor(cond))
rownames(sampleTable) <- colnames(txi$counts)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)

Do you have any suggestions? Thanks!

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS/LAPACK: /Users/miniconda3/envs/bioinfo/lib/libopenblasp-r0.3.7.dylib

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.0.2

Thanks a lot for the helpful reply; adding

txi$length[txi$length <= 0] <- 1


before

dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)


has apparently solved the problem.
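
The workaround above can be tried on a toy object. The list `txi` below only mimics the shape of a tximport result, with made-up numbers; it is not produced by tximport, and `n_bad` is a name introduced here for illustration.

```r
# Toy stand-in for a tximport result (invented values, not real tximport output).
gene_ids <- c("g1", "g2", "g3")
sample_ids <- c("sample1", "sample2")
txi <- list(
  counts = matrix(c(12, 0, 30, 15, 0, 28), nrow = 3,
                  dimnames = list(gene_ids, sample_ids)),
  length = matrix(c(1500, 0, 800, 1450, 0, 820), nrow = 3,
                  dimnames = list(gene_ids, sample_ids))
)

# Count the entries that would fail an all(lengths > 0) check.
n_bad <- sum(txi$length <= 0)

# Replace non-positive lengths with 1 so that the check passes.
txi$length[txi$length <= 0] <- 1

stopifnot(all(txi$length > 0))
print(n_bad)
```

Counting the affected entries first is a cheap way to see how many genes the replacement touches before building the dataset.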

BTW, is it possible to import RSEM data into DESeq2 and combine them with data from featureCounts in order to build a merged matrix of normalized counts? Thanks!


I would say that is not trivial: you would need to correct for, or model, the differences in quantification method somehow. I haven't attempted this.