Question

tximport, Kallisto and Deseq2 (quick answer)

Mozart ▴ 30

@mozart-20625

Hello there, I am trying to analyse a dataset with DESeq2, using kallisto output imported via tximport. I am using the following code:

tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)

So in this case countsFromAbundance is "no", I guess (I am not specifying it, so the default should apply). These counts are not normalised for length bias, right? But if I understood the following note correctly, that is handled by the DESeqDataSetFromTximport function:

Note: there are two suggested ways of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated counts from the quantification tools, and additionally to use the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes to the average transcript length across samples. The code examples below accomplish these steps for you, keeping track of appropriate matrices and calculating these offsets. For edgeR you need to assign a matrix to y$offset, but the function DESeqDataSetFromTximport takes care of creation of the offset for you. Let’s call this method “original counts and offset”.

In other words, I circumvent the length bias because the DESeqDataSetFromTximport function automatically corrects the counts coming from the tsv files for length bias, right?:

dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)

If someone could confirm this, that would be great.

kallisto tximport deseq2 counts • 6.6k views

6.7 years ago • updated 5.8 years ago Mozart ▴ 30

score 1 · Answer 1 · 2019-04-26

Michael Love 43k

@mikelove

United States

The helper function DESeqDataSetFromTximport takes care of everything for you. It uses counts plus a length offset by default, but if it detects that you used scaledTPM or lengthScaledTPM then it doesn’t bring an offset. It does the right thing either way.
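To make the "counts plus a length offset" idea concrete, here is a minimal base-R sketch with made-up numbers. It only illustrates the general approach (divide out the per-gene average transcript length, then apply median-of-ratios scaling) and is not the DESeq2 source. The matrices `counts` and `avg_len` are hypothetical stand-ins for `txi$counts` and `txi$length`, and the helper names are invented for this example.

```r
# Toy gene-level estimated counts (3 genes x 2 samples); made-up values.
counts <- matrix(c(100, 200, 50,
                   120, 210, 100),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Toy average transcript lengths per gene and sample; made-up values.
# Gene g3 switches to a longer isoform in sample s2.
avg_len <- matrix(c(1000, 2000, 500,
                    1000, 2000, 1000),
                  nrow = 3, dimnames = dimnames(counts))

# Row-wise geometric mean (hypothetical helper).
geo_mean_rows <- function(m) exp(rowMeans(log(m)))

# Centre the lengths per gene so only sample-to-sample changes remain.
norm_len <- avg_len / geo_mean_rows(avg_len)

# Median-of-ratios size factors on a strictly positive matrix (hypothetical helper).
median_ratio <- function(m) {
  log_ref <- rowMeans(log(m))
  apply(m, 2, function(x) exp(median(log(x) - log_ref)))
}
sf <- median_ratio(counts / norm_len)

# Combine length and depth effects into gene-by-sample normalization factors,
# centred per gene so their geometric mean is 1.
norm_factors <- t(t(norm_len) * sf)
norm_factors <- norm_factors / geo_mean_rows(norm_factors)

# Counts on a comparable scale across samples.
corrected <- counts / norm_factors
print(round(corrected, 1))
```

In this toy data the count for g3 doubles only because its average transcript length doubles, so after dividing by the factors the two samples look nearly identical for that gene. With real data you do not need to do any of this by hand, as the answer says.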

6.7 years ago Michael Love 43k

Thanks a lot for your quick reply. So, just for clarity's sake, this would be one possible solution:

library(tximport)
library(DESeq2)

txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)
sampleTable <- data.frame(condition = factor(c("a","a","a","b","b","b")))
rownames(sampleTable) <- colnames(txi.kallisto.tsv$counts)
dds <- DESeqDataSetFromTximport(txi.kallisto.tsv, sampleTable, ~condition)

# set the reference level before fitting, then run DESeq once
dds$condition <- relevel(dds$condition, ref = "b")
dds <- DESeq(dds)
res <- results(dds)

right?

6.7 years ago Mozart ▴ 30

Yes, this is the recommended pipeline. Using scaled TPM would also work. The function takes care of everything.
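For readers following along, the two routes can be told apart by the `countsFromAbundance` value that tximport stores on its returned list. The sketch below is only an illustration: the mock `txi` lists and the helper `import_mode()` are hypothetical and do not call tximport or DESeq2, and the labels it returns are informal descriptions.

```r
# The scaled-TPM import would look like this (not run here; needs real files):
# txi <- tximport(files, type = "kallisto", tx2gene = tx2gene,
#                 ignoreTxVersion = TRUE,
#                 countsFromAbundance = "lengthScaledTPM")

# Hypothetical helper: report which route applies for a tximport-style list.
import_mode <- function(txi) {
  cfa <- txi$countsFromAbundance
  if (cfa == "no") {
    "original counts and offset"
  } else if (cfa %in% c("scaledTPM", "lengthScaledTPM")) {
    "counts only, no offset"
  } else {
    stop("unexpected countsFromAbundance value: ", cfa)
  }
}

# Mock objects carrying only the field the helper inspects.
txi_default <- list(countsFromAbundance = "no")
txi_scaled  <- list(countsFromAbundance = "lengthScaledTPM")

print(import_mode(txi_default))
print(import_mode(txi_scaled))
```

Either kind of `txi` object is then passed to DESeqDataSetFromTximport in the same way, which is why the rest of the pipeline does not change.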

6.7 years ago Michael Love 43k

That sounds great. Thanks a lot, both for this support and for giving me the opportunity to do good science thanks to your packages.

6.7 years ago Mozart ▴ 30