I am conducting a metatranscriptomic RNA-seq analysis. RNA-seq data were obtained from several environmental samples, and the reads were then assembled into contigs per sample. For quantification with Salmon, an index was built for each assembly, and the raw reads of each sample were quantified against that sample's assembled contigs. The contigs were annotated against a reference database with DIAMOND, yielding KEGG KO terms and a taxonomic annotation where available; contigs without a hit were marked as 'unknown'.
I want to load all of these results with tximport for analysis with DESeq2. However, tximport needs a tx2gene object, so I tried creating one myself from the contig names and the KO terms. I keep running into this error:
reading in files with read_tsv
1 2
Error in tximport(salmon_counts, type = "salmon", tx2gene = tx2gene) :
  all(txId == raw[[txIdCol]]) is not TRUE
In addition: Warning message:
In txId == raw[[txIdCol]] :
  longer object length is not a multiple of shorter object length
I checked, and the quant.sf files and the tx2gene data frame contain the same number of contigs. Is there a way around this that would let me create a DESeq2 dataset from de novo assembled and annotated metatranscriptomic contigs and their count files?
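Judging from the message, the failing assertion appears to compare the contig IDs of the first quant.sf file (txId) with those of each later file, so matching counts alone may not be enough: the IDs themselves, and their order, would have to agree across samples. A minimal sketch of that comparison in base R follows. The contig names and temporary files are made-up stand-ins for my real data, and `compare_quant_ids` and `write_toy` are hypothetical helpers, not part of tximport.

```r
# Hypothetical helper: compare the contig IDs (the "Name" column) of several
# Salmon quant.sf files against the IDs in the first file.
compare_quant_ids <- function(files) {
  ids <- lapply(files, function(f) {
    read.delim(f, stringsAsFactors = FALSE)$Name
  })
  ref <- ids[[1]]
  data.frame(
    file       = files,
    n_ids      = vapply(ids, length, integer(1)),
    # same IDs regardless of order
    same_set   = vapply(ids, function(x) setequal(x, ref), logical(1)),
    # same IDs in exactly the same order
    same_order = vapply(ids, function(x) identical(x, ref), logical(1)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: write a toy file with the quant.sf column layout.
write_toy <- function(path, contig_names) {
  toy <- data.frame(
    Name            = contig_names,
    Length          = 500L,
    EffectiveLength = 350,
    TPM             = 1,
    NumReads        = 10
  )
  write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)
}

# Two toy "samples" with the same number of contigs but different IDs
# (assumption: this mimics what per-sample assemblies produce).
tmp <- tempfile(c("sampleA_", "sampleB_"), fileext = ".sf")
write_toy(tmp[1], c("contig_1", "contig_2", "contig_3"))
write_toy(tmp[2], c("contig_1", "contig_2", "contig_9"))

result <- compare_quant_ids(tmp)
print(result)
```

If `same_order` is FALSE for any file after the first, the files do not share one set of contig IDs, even when `n_ids` is identical.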
Thanks in advance!