Question: Using tximport for data generated by other tools
12 months ago by
saumya.kumar0 wrote:

Hello,

I want to try using tximport with another tool that I am evaluating, eXpress. To do this, I programmatically converted the quantification files generated by eXpress into salmon's quant.sf format and then tried to import them. I get the following error:

reading in files
1 2 Error: all(txId == raw[[txIdCol]]) is not TRUE

This seems to point to the transcript IDs (txId). I have checked that column in file 2 and it is not empty. I am using this command:

txi.express <- tximport(Files, type = "salmon", tx2gene = tx2gene)

The tx2gene file consists of Ensembl transcript IDs and gene IDs. I have used this file with Salmon results before and it worked.

Is there a way I can do this? tximport seems to read file 1 without any error, and it is at file 2 that it gives this error.

Thanks,

Saumya

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] DESeq2_1.12.4                            SummarizedExperiment_1.2.3
[3] TxDb.Mmusculus.UCSC.mm10.knownGene_3.2.2 GenomicFeatures_1.24.5
[5] AnnotationDbi_1.34.4                     Biobase_2.32.0
[7] GenomicRanges_1.24.3                     GenomeInfoDb_1.8.7
[9] IRanges_2.6.1                            S4Vectors_0.10.3
[11] BiocGenerics_0.18.0                      tximport_1.0.3

loaded via a namespace (and not attached):
[1] genefilter_1.54.2       locfit_1.5-9.1          splines_3.3.1           lattice_0.20-35
[5] colorspace_1.3-2        htmltools_0.3.6         rtracklayer_1.32.2      base64enc_0.1-3
[9] survival_2.41-3         XML_3.98-1.7            foreign_0.8-68          DBI_0.6-1
[13] BiocParallel_1.6.6      RColorBrewer_1.1-2      plyr_1.8.4              stringr_1.2.0
[17] zlibbioc_1.18.0         Biostrings_2.40.2       munsell_0.4.3           gtable_0.2.0
[21] htmlwidgets_0.8         memoise_1.1.0           latticeExtra_0.6-28     knitr_1.15.1
[25] geneplotter_1.50.0      biomaRt_2.28.0          htmlTable_1.9           Rcpp_0.12.10
[29] xtable_1.8-2            acepack_1.4.1           scales_0.4.1            backports_1.0.5
[33] checkmate_1.8.2         annotate_1.50.1         Hmisc_4.0-3             XVector_0.12.1
[37] Rsamtools_1.24.0        gridExtra_2.2.1         ggplot2_2.2.1           digest_0.6.12
[41] stringi_1.1.5           grid_3.3.1              tools_3.3.1             bitops_1.0-6
[45] magrittr_1.5            RCurl_1.95-4.8          lazyeval_0.2.0          RSQLite_1.1-2
[49] tibble_1.3.0            Formula_1.2-1           cluster_2.0.6           Matrix_1.2-10
[53] data.table_1.10.4       rpart_4.1-11            GenomicAlignments_1.8.4 nnet_7.3-12


modified 6 weeks ago by Michael Love18k • written 12 months ago by saumya.kumar0
12 months ago by
Michael Love18k (United States) wrote:

tximport needs all of the 'files' to list the transcripts in the same order. The package is designed to import quantifications of multiple samples that were run using the same software against the same transcriptome. If you want to read in files produced by different software, you can import the individual files separately with tximport. We do not support merging quantifications across software, though.
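One rough way to see which file trips the check is to compare the transcript ID column of every file against the first one. The sketch below uses only base R and assumes salmon-style quant.sf files (tab-separated, with a header and a "Name" column); the helper names read_tx_ids and check_tx_order are made up for illustration and are not part of tximport.

```r
# Read only the transcript ID column of a salmon-style quant.sf file.
# Assumption: tab-separated file with a header row and an ID column called "Name".
read_tx_ids <- function(path, id_col = "Name") {
  quant <- read.delim(path, stringsAsFactors = FALSE)
  quant[[id_col]]
}

# For each file, report whether its transcript IDs match the first file exactly
# (same IDs, same order) and whether at least the same set of IDs is present.
check_tx_order <- function(files) {
  reference <- read_tx_ids(files[1])
  data.frame(
    file       = files,
    same_order = vapply(files, function(f) identical(read_tx_ids(f), reference),
                        logical(1), USE.NAMES = FALSE),
    same_set   = vapply(files, function(f) setequal(read_tx_ids(f), reference),
                        logical(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy demonstration: two temporary files with the same transcripts in a different order
toy <- data.frame(Name = c("tx1", "tx2", "tx3"),
                  Length = c(100, 200, 300),
                  EffectiveLength = c(90, 190, 290),
                  TPM = c(1, 2, 3),
                  NumReads = c(10, 20, 30))
f1 <- tempfile(fileext = ".sf")
f2 <- tempfile(fileext = ".sf")
write.table(toy, f1, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(toy[c(2, 1, 3), ], f2, sep = "\t", quote = FALSE, row.names = FALSE)

order_report <- check_tx_order(c(f1, f2))
print(order_report[, c("same_order", "same_set")])
```

If same_set is TRUE but same_order is FALSE, sorting the rows of each converted file by the ID column before import may be enough; if same_set is FALSE, the files were not quantified against the same transcript set.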

Hi Michael,

I am trying to set up a pipeline for interspecies RNA-seq comparison (human and other primates), and I was testing whether Salmon, tximport and DESeq2 would work. I ran Salmon on my fastq files, using the index for the respective species. However, when I try to load them all, together with a concatenated tx2gene annotation file, into a "txi" object using tximport, I get the exact same error message as described above ("Error: all(txId == raw[[txIdCol]]) is not TRUE"). When I load the samples and tx2gene file for each species separately, everything works fine. As I know the orthologous correspondence between the genes of the different species, I thought of somehow merging these individual "txi" objects and loading the result into DESeq2.

However, I'm not sure how to proceed: 1) Is it possible to merge txi objects by a set of orthologous genes into one txi object? If yes, is it then possible to load it into DESeq2?

Thanks, Alex

I'd recommend doing this manually. tximport() assumes that all the files have the same set of transcripts.

Thanks!

A couple more questions:

(btw, shall I continue here, or start a new thread?)

1) I see that the "txi" object is a list with 4 elements, 3 of which are matrices (abundance, counts, length). Are all of these needed to load the data into DESeq2? Shall I manually re-create all these elements separately, by merging from individual species-specific "txi" data?

2) I have a table of orthologous gene IDs between the 3 species. So, at some point in the analysis I will need to convert the primate gene IDs to human gene IDs. Is it better to do that before running tximport on them? (I can generate tx2gene tables with, say, macaque transcript ID vs. human gene ID to do so.) Alternatively, I could do that just before merging the individual txi datasets into one "txi" object.

1) It is safest to just make a list with all 4 elements: the 3 merged matrices, and then the countsFromAbundance value.
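As a sketch of what such a manual merge could look like, assuming gene-level txi lists per species and a one-to-one ortholog table: the helper merge_txi_by_orthologs, the toy objects and all gene and sample names below are hypothetical, and the code uses only base R. It also assumes countsFromAbundance is stored as a character value such as "no" and that it is the same for every species.

```r
# Hypothetical helper: combine per-species txi lists on a shared set of orthologous genes.
#   txi_list  : named list of txi-like lists (abundance, counts, length, countsFromAbundance)
#   orthologs : data.frame with one column per species (column names match names(txi_list));
#               each row is one 1:1 ortholog group, and the first column supplies the
#               row names of the merged matrices.
merge_txi_by_orthologs <- function(txi_list, orthologs) {
  species <- names(txi_list)

  # Keep only ortholog groups that were quantified in every species
  present <- Reduce(`&`, lapply(species, function(s) {
    orthologs[[s]] %in% rownames(txi_list[[s]]$counts)
  }))
  orthologs <- orthologs[present, , drop = FALSE]

  # Subset and reorder one matrix per species, rename rows, then bind the columns
  merge_one <- function(element) {
    mats <- lapply(species, function(s) {
      m <- txi_list[[s]][[element]][orthologs[[s]], , drop = FALSE]
      rownames(m) <- orthologs[[1]]
      m
    })
    do.call(cbind, mats)
  }

  # All inputs must have been imported with the same countsFromAbundance setting
  cfa <- unique(vapply(txi_list, function(x) x$countsFromAbundance,
                       character(1), USE.NAMES = FALSE))
  stopifnot(length(cfa) == 1)

  list(abundance = merge_one("abundance"),
       counts = merge_one("counts"),
       length = merge_one("length"),
       countsFromAbundance = cfa)
}

# Toy stand-ins for the per-species tximport results
make_txi <- function(genes, samples) {
  m <- matrix(seq_along(genes) * 10, nrow = length(genes), ncol = length(samples),
              dimnames = list(genes, samples))
  list(abundance = m / 10, counts = m, length = m * 100, countsFromAbundance = "no")
}
txi_human   <- make_txi(c("hsG1", "hsG2", "hsG3"), c("h1", "h2"))
txi_macaque <- make_txi(c("mmG2", "mmG1"), "m1")
orthologs <- data.frame(human   = c("hsG1", "hsG2", "hsG3"),
                        macaque = c("mmG1", "mmG2", NA),
                        stringsAsFactors = FALSE)

txi_merged <- merge_txi_by_orthologs(list(human = txi_human, macaque = txi_macaque),
                                     orthologs)
```

The merged list should then be usable in the same way as a single tximport result, for example with DESeqDataSetFromTximport() and a sample table whose rows follow the column order of txi_merged$counts.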

2) You can use tximport to summarize directly from primate transcripts to human genes, by giving it a tx2gene table that maps primate transcript IDs to human gene IDs.
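A minimal sketch of building such a table in base R, assuming a macaque transcript-to-gene table and a one-to-one ortholog table are already at hand. All IDs, the object names and the column names are invented for illustration; tximport itself only expects transcript IDs in the first column and gene IDs in the second.

```r
# Hypothetical inputs: macaque transcript-to-gene table and a 1:1 ortholog table
tx2gene_macaque <- data.frame(TXNAME = c("mmT1", "mmT2", "mmT3", "mmT4"),
                              GENEID = c("mmG1", "mmG1", "mmG2", "mmG9"),
                              stringsAsFactors = FALSE)
orthologs <- data.frame(macaque = c("mmG1", "mmG2"),
                        human   = c("hsG1", "hsG2"),
                        stringsAsFactors = FALSE)

# Swap each macaque gene ID for its human ortholog
idx <- match(tx2gene_macaque$GENEID, orthologs$macaque)
tx2gene_mac2hs <- data.frame(TXNAME = tx2gene_macaque$TXNAME,
                             GENEID = orthologs$human[idx],
                             stringsAsFactors = FALSE)

# Leave out transcripts whose gene has no human ortholog in the table
tx2gene_mac2hs <- tx2gene_mac2hs[!is.na(tx2gene_mac2hs$GENEID), ]
print(tx2gene_mac2hs)

# The table could then be used for the macaque samples, e.g.
# (macaque_files is a hypothetical vector of quant.sf paths):
# txi_macaque <- tximport(macaque_files, type = "salmon", tx2gene = tx2gene_mac2hs)
```

Done this way for each primate species, the resulting gene-level matrices already carry human gene IDs as row names, which simplifies matching rows when the per-species results are combined.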