Using RSEM with tximport and DESeq2
@falkohofmann-12403

I came across an issue when trying to import RSEM files into DESeq2 according to the online tutorial.

Running the following code on my data:

txi.rsem <- tximport(files, type = "rsem")
sampleTable <- data.frame(condition = factor(rep(c('A', 'B','C','D'), each = 3)))
dds <- DESeqDataSetFromTximport(txi.rsem, sampleTable, ~condition)



gives me the error:

using counts and average transcript lengths from tximport
Error: all(lengths > 0) is not TRUE

After looking at the data, I think this happens when a gene is not expressed in any of the samples.
Such a gene has an effective length of 0 in all samples, which seems to cause the issue.

Is there a quick fix for this problem?

deseq2 tximport software error bug
@mikelove

You can set the 0 lengths to 1 by editing the length matrix in txi before starting with DESeq2.

For gene-level RSEM input, we're just taking the gene effective lengths as reported by RSEM, whereas when summarizing from the transcript level, tximport ensures the lengths are nonzero.
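A minimal sketch of the quick fix described above. The small `txi.rsem` list built here is a made-up stand-in for the object returned by `tximport(files, type = "rsem")`, with hypothetical gene and sample names, so that the snippet runs on its own. With real data you would only need the replacement line.

```r
# Mock stand-in for the list returned by tximport(files, type = "rsem").
# Gene names, sample names and values are invented for illustration.
dn <- list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
txi.rsem <- list(
  counts = matrix(c(10, 0, 25, 12, 0, 30), nrow = 3, dimnames = dn),
  length = matrix(c(1500, 0, 800, 1480, 0, 820), nrow = 3, dimnames = dn)
)

# Count the zero-length entries before changing anything
n_zero <- sum(txi.rsem$length == 0)

# The fix itself: replace effective lengths of 0 with 1
txi.rsem$length[txi.rsem$length == 0] <- 1

# This is the condition that the error message reports as failing
stopifnot(all(txi.rsem$length > 0))
```

After this step, the call to `DESeqDataSetFromTximport(txi.rsem, sampleTable, ~condition)` from the question should no longer stop on the `all(lengths > 0)` check.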

Edit: (Sep 6, 2021) Some demo code for reading in RSEM gene-level counts with tximeta and dealing with 0-length values.

library(tximeta)
library(SummarizedExperiment)
se <- tximeta(coldata, type="rsem", txIn=FALSE, txOut=FALSE, skipMeta=TRUE)
assays(se)$length[ assays(se)$length == 0] <- NA # set these as missing


Examine how many genes have at least X missing values (consider X = half the samples), and remove those genes:

idx <- rowSums(is.na(assays(se)$length)) >= X
table(idx)
se <- se[!idx,]

Impute lengths for the 0-length values:

library(impute)
length_imp <- impute.knn(assays(se)$length)
assays(se)$length <- length_imp$data


Works fine, thanks!