Tximport and pseudo alignment with Kallisto
Mozart ▴ 30

Hi there, I am using Kallisto to generate counts in my RNA-seq experiments. Since I prefer DESeq2 for the downstream analysis, I have to use tximport to import the transcript abundances before performing differential expression analysis. I am following the code in the tximport package documentation closely and, so far, I have never had any problems with it.

However, I noticed that when doing the pseudoalignment in Kallisto I can generate counts either by running all of the paired samples at once

kallisto quant -i index -o output pairA_1.fastq pairA_2.fastq pairB_1.fastq pairB_2.fastq

or by running one pair at a time

kallisto quant -i index -o output pairA_1.fastq pairA_2.fastq

This is the crucial point with tximport, because at the end of a Kallisto run I usually end up with a number of sample folders equal to the number of samples in the experiment (i.e. 6 folders for 6 samples). This allowed me to use the following:

files <- file.path(dir, "kallisto", samples$run, "abundance.tsv") 
names(files) <- paste0("sample", 1:6) 
txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreAfterBar = TRUE)

However, I am not able to use tximport if I run all of my samples at once (which generates just one abundance.tsv file). Since I presume that the pseudoalignment is identical either way (and probably because this would be much easier for Sleuth users), I would stick with the one-pair-at-a-time method, just to make my life easier and simplify the use of tximport.

But I am just seeking confirmation of this, guys.

PS: sorry for posting such a borderline topic; it is very difficult for me to get off the ground with this technique, and any relevant opinions are more than welcome.

PPS: in the kallisto documentation they also mention as an important note that one should 'only supply one sample at a time to kallisto. The multiple FASTQ (pair) option is for users who have samples that span multiple FASTQ files.'

kallisto tximport deseq2 • 2.0k views
@mikelove

Your first line of code above mistakenly collapses multiple biological replicates into a single sample. As you say in your question, the option to give multiple input files is for use with technical replicates only.
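As a rough illustration of running one sample per kallisto call, here is a minimal sketch of a loop that builds one command per sample, each with its own output folder. The sample names pairA and pairB and the index name are taken from the question. The FASTQ naming pattern, the kallisto/<sample> output layout (chosen to match the file.path() call in the question) and the helper name kallisto_cmd are assumptions for illustration only. The loop just prints the commands so they can be checked before anything is run.

```shell
#!/usr/bin/env bash
# Sketch: one kallisto run per biological sample, each writing to its own folder.
# Assumptions (not from the original post): FASTQ files are named
# <sample>_1.fastq and <sample>_2.fastq, and results go to kallisto/<sample>.
set -euo pipefail

# Hypothetical helper: print the kallisto command for one sample (dry run).
# $1 = sample name, $2 = index file (defaults to "index" as in the question).
kallisto_cmd() {
  local sample="$1"
  local index="${2:-index}"
  echo "kallisto quant -i ${index} -o kallisto/${sample} ${sample}_1.fastq ${sample}_2.fastq"
}

# One command per sample; replace the list with your own sample names.
for sample in pairA pairB; do
  kallisto_cmd "$sample"
  # To actually run the command instead of printing it:
  # eval "$(kallisto_cmd "$sample")"
done
```

With one output folder per sample, each folder gets its own abundance file, which is the layout the tximport code in the question expects.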
