TxImport when quant.sf files have different index files
mprnl • 0
@mprnl-25060

Hi,

I am conducting a metatranscriptomic RNA-seq analysis. RNA-seq data were obtained from several environmental samples, and the reads were then assembled into contigs per sample. For the salmon quantification, an index was built per assembly, and the raw reads of each sample were quantified against that sample's assembled contigs. The contigs were annotated against a reference database with diamond, yielding KEGG KO terms and a taxonomic annotation where available; otherwise contigs were marked as 'unknown'.

I want to load all these results for analysis with DESeq2 using tximport. However, tximport needs a tx2gene object, so I tried creating one myself from the contig names and the KO terms. I keep running into this error:

 reading in files with read_tsv
1 2 Error in tximport(salmon_counts, type = "salmon", tx2gene = tx2gene) :
all(txId == raw[[txIdCol]]) is not TRUE
In txId == raw[[txIdCol]] :
longer object length is not a multiple of shorter object length


I checked, and the same number of contigs is present in the quant.sf files and the tx2gene data frame. Is there any way around this that allows creating a DESeq2 dataset from metatranscriptomic de novo assembled and annotated contigs and count files?

tximport DESeq2 • 198 views
@mikelove

tximport and tximeta rely on the samples being quantified against the same transcriptome. There are many places in the code where this assumption is relied upon or enforced, so you're likely better off just scripting this from scratch. I think it will be slower than tximport, because you won't be able to use aggregating functions that operate on matrices, such as rowsum.
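
For illustration, here is a minimal base-R sketch of what scripting this from scratch could look like. It is not part of tximport. It assumes each sample has its own quant.sf (with the standard salmon columns Name and NumReads) and its own contig-to-KO table. The function names summarize_sample and combine_samples, the contig/ko column layout, and the toy data are all hypothetical. Each sample is collapsed to KO level on its own, and the per-sample results are then merged over the union of KO terms.

```r
# Sum salmon NumReads per KO term for one sample.
# quant_file: path to that sample's quant.sf
# contig2ko:  data frame with columns `contig` and `ko` for the same assembly
#             (hypothetical layout; adapt to your annotation output)
summarize_sample <- function(quant_file, contig2ko) {
  quant <- read.delim(quant_file, stringsAsFactors = FALSE)
  ko <- as.character(contig2ko$ko)[match(quant$Name, contig2ko$contig)]
  ko[is.na(ko)] <- "unknown"                 # contigs without a KO annotation
  sums <- rowsum(quant$NumReads, group = ko) # one-column matrix, one row per KO
  setNames(as.numeric(sums), rownames(sums))
}

# Merge per-sample named vectors into a KO x sample matrix.
# KO terms absent from a sample are filled with 0.
# per_sample must be a named list (names become the column names).
combine_samples <- function(per_sample) {
  all_ko <- sort(unique(unlist(lapply(per_sample, names))))
  mat <- matrix(0, nrow = length(all_ko), ncol = length(per_sample),
                dimnames = list(all_ko, names(per_sample)))
  for (s in names(per_sample)) {
    mat[names(per_sample[[s]]), s] <- per_sample[[s]]
  }
  mat
}

# Toy example: two samples with different contig sets (made-up values).
write_quant <- function(contigs, reads) {
  f <- tempfile(fileext = ".sf")
  write.table(data.frame(Name = contigs, Length = 1000, EffectiveLength = 900,
                         TPM = 1, NumReads = reads),
              f, sep = "\t", quote = FALSE, row.names = FALSE)
  f
}
files <- c(s1 = write_quant(c("c1", "c2", "c3"), c(10, 5, 2)),
           s2 = write_quant(c("k1", "k2"), c(7, 3)))
annot <- list(
  s1 = data.frame(contig = c("c1", "c2"), ko = c("K00001", "K00001")),
  s2 = data.frame(contig = c("k1", "k2"), ko = c("K00001", "K00002"))
)

per_sample <- lapply(names(files),
                     function(s) summarize_sample(files[[s]], annot[[s]]))
names(per_sample) <- names(files)
counts <- combine_samples(per_sample)
```

The resulting matrix could then be rounded to integers and passed to DESeq2's DESeqDataSetFromMatrix. Note that this route skips the length-based offsets that tximport would otherwise supply, so it is only an approximation of the usual workflow.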
