Question

Error reading kallisto files using tximport for cross species RNA-Seq analysis

urjaswita ▴ 40

@urjaswita-13128

Hi folks,

I want to compare RNA-Seq datasets from the same tissues across multiple species. I quantified the RNA-Seq data using kallisto, and now I want to read the kallisto files using tximport for DESeq2 analysis. My abundance files have identical transcript IDs for each species, but the transcript lengths differ because gene lengths differ between species. When I try to read the files using tximport, I get the following error:

> sample <- read.table("samples.txt", header = T)
> files <- file.path(dir, "kallisto_results",sample$samplenames, "abundance.tsv")
> txt2gene <- read.table("Transcript-to-Gene-mapping.txt", header=T, sep = "\t")
> txi.kallisto <- tximport(files, type="kallisto", tx2gene = txt2gene)
reading in files
1 2 3 Error: all(txId == raw[[txIdCol]]) is not TRUE

The first two files are from the same species, and the third, which triggers the error, is from a different species. Can someone please help me with this? I am really stuck, as I cannot work out what this error means. My guess is that it is due to the difference in transcript length for the same ID in different species, but I am not sure!

Please let me know if you know what's going on here.

Thanks so much in advance!

- Urja

kallisto tximport RNA-Seq across species • 3.2k views

ADD COMMENT • link updated 4.3 years ago by nicolette.sipperly • 0 • written 8.7 years ago by urjaswita ▴ 40

Thanks, Michael, for your prompt response! It worked when I sorted the genes into the same order. I have two quick follow-up questions:

1. Are the kallisto abundance estimates sorted in a particular order?

2. Would it affect the DESeq2 analysis if I sort the genes alphabetically?

I suppose the answer to both questions is no?

Thanks again!

-Urja

ADD REPLY • link 8.7 years ago urjaswita ▴ 40

I believe the kallisto abundances follow the order of the transcripts in the transcriptome FASTA. But that's a kallisto question.

The order of the genes makes no difference to the inference methods.
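Since row order does not matter for the inference, one way to bring a table into a common order is to reorder it by a reference vector of transcript IDs. The following is only a minimal base R sketch: `reorder_abundance` is a hypothetical helper name (not part of tximport or kallisto), the toy tables stand in for real abundance files, and it assumes the ID column is called `target_id` as in kallisto's `abundance.tsv`.

```r
# Hypothetical helper: reorder an abundance table so that its rows follow
# the order of `ref_ids`. It stops if the two ID sets are not the same,
# because reordering alone cannot fix that case.
reorder_abundance <- function(tab, ref_ids) {
  idx <- match(ref_ids, tab$target_id)
  if (anyNA(idx) || nrow(tab) != length(ref_ids)) {
    stop("transcript ID sets differ; reordering alone will not fix this")
  }
  out <- tab[idx, , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data standing in for two abundance tables (assumed values)
ref <- data.frame(target_id = c("tx1", "tx2", "tx3"),
                  est_counts = c(10, 20, 30),
                  stringsAsFactors = FALSE)
other <- data.frame(target_id = c("tx3", "tx1", "tx2"),
                    est_counts = c(3, 1, 2),
                    stringsAsFactors = FALSE)

fixed <- reorder_abundance(other, ref$target_id)
```

Here `fixed` holds the rows of `other` in the order of `ref$target_id`, with each count still attached to its own transcript. The reordered tables would then need to be written back to files before being passed to tximport.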

ADD REPLY • link 8.7 years ago Michael Love 43k

If you're using kallisto to quantify your sample I'd recommend abandoning tximport -> DESeq2 and using sleuth instead. It's more accurate. See https://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.4324.html.

ADD REPLY • link 8.6 years ago lakigigar • 0

score 0 · Answer 1 · 2017-05-26

Michael Love 43k

@mikelove

That error means the transcript IDs (names) were not equal across the files. The files need to have the same transcript IDs in the same order.
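A quick way to see which file breaks this requirement is to compare the `target_id` column of every abundance file against the first one. This is only a sketch in base R: `check_tx_ids` is a hypothetical helper (not a tximport function), and the toy files written to a temporary directory stand in for real kallisto output.

```r
# Hypothetical helper: for each abundance.tsv, report whether its target_id
# column has the same set of IDs, and the same order, as the first file.
check_tx_ids <- function(files) {
  ids <- lapply(files, function(f) {
    read.delim(f, stringsAsFactors = FALSE)$target_id
  })
  ref <- ids[[1]]
  data.frame(
    file = files,
    same_set = vapply(ids, function(x) setequal(x, ref), logical(1)),
    same_order = vapply(ids, function(x) identical(x, ref), logical(1)),
    stringsAsFactors = FALSE
  )
}

# Write a toy abundance file with kallisto's column names (values are made up)
toy_abundance <- function(ids) {
  f <- tempfile(fileext = ".tsv")
  write.table(
    data.frame(target_id = ids, length = 1000, eff_length = 800,
               est_counts = 10, tpm = 5),
    f, sep = "\t", quote = FALSE, row.names = FALSE
  )
  f
}

files <- c(toy_abundance(c("tx1", "tx2", "tx3")),
           toy_abundance(c("tx1", "tx2", "tx3")),
           toy_abundance(c("tx3", "tx1", "tx2")))

res <- check_tx_ids(files)
print(res[, c("same_set", "same_order")])
```

In the toy case the third file has the same set of IDs but a different order, so `same_set` is `TRUE` and `same_order` is `FALSE` for it. With real data, `files` would be the vector of `abundance.tsv` paths passed to tximport.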

ADD COMMENT • link 8.7 years ago Michael Love 43k

I have sorted my tx2gene map and kallisto abundance files so that they have the same transcripts in the same order, but I am still receiving this error. Is there another issue that could cause it?

ADD REPLY • link 4.3 years ago nicolette.sipperly • 0

Could you post a new question, following the guide:

http://bioconductor.org/help/support/posting-guide/#keypoints

ADD REPLY • link 4.3 years ago Michael Love 43k

Hi Michael,

I just posted with the info asked for on the guide. I called it: tximport error: all(txId == raw[[txIdCol]]) is not TRUE

Please let me know if there is anything else I should provide.

Thank you for your response -- it is very much appreciated!!

Nicolette

ADD REPLY • link 4.3 years ago nicolette.sipperly • 0