Question

TxImport when quant.sf files have different index files


mprnl • 0

@mprnl-25060


Hi,

I am conducting a metatranscriptomic RNAseq analysis. RNA-seq data was obtained from several environmental samples, after which the reads were assembled into contigs per sample. For the salmon quantification, indexes were built per assembly. Raw reads were then quantified against the assembled contigs. The contigs were annotated against a reference database with diamond, obtaining KEGG KO terms and a taxonomic annotation when available, otherwise contigs were marked as 'unknown'.

I want to load all of these results into DESeq2 using tximport. tximport, however, needs a tx2gene object, so I tried creating one myself from the contig names and the KO terms. However, I keep running into this error:

 reading in files with read_tsv
1 2 Error in tximport(salmon_counts, type = "salmon", tx2gene = tx2gene) : 
  all(txId == raw[[txIdCol]]) is not TRUE
In addition: Warning message:
In txId == raw[[txIdCol]] :
  longer object length is not a multiple of shorter object length

I checked, and the quant.sf files and the tx2gene data frame contain the same number of contigs. Is there any way around this that would let me build a DESeq2 dataset from de novo assembled and annotated metatranscriptomic contigs and their count files?

Thanks in advance!

tximport DESeq2 • 993 views


score 2 · Accepted Answer · 2021-03-17


Michael Love 41k

@mikelove


United States

tximport and tximeta rely on all samples being quantified against the same transcriptome. This assumption is relied upon or enforced in many places in the code, so you're likely better off scripting this from scratch. I think it will be slower than tximport, because you won't be able to use aggregating functions that operate on matrices, such as rowsum.
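
As a rough illustration of what "scripting this from scratch" could look like, here is a minimal base-R sketch. It assumes one contig-to-KO table per sample with columns named `contig` and `ko`; those column names and the helper names `sum_by_ko` and `build_ko_matrix` are hypothetical, not part of tximport. Each quant.sf is read on its own, NumReads are summed per KO term within that sample, and the per-sample vectors are then aligned on the union of KO terms. Contigs without a KO are pooled under "unknown".

```r
# Sum salmon NumReads per KO term for a single sample.
# quant_path: path to that sample's quant.sf
# tx2ko:      data.frame with columns `contig` and `ko` (assumed names)
#             for the assembly that sample was quantified against
sum_by_ko <- function(quant_path, tx2ko) {
  quant <- read.delim(quant_path, stringsAsFactors = FALSE)
  ko <- as.character(tx2ko$ko)[match(quant$Name, tx2ko$contig)]
  ko[is.na(ko)] <- "unknown"          # contigs with no KO annotation
  tapply(quant$NumReads, ko, sum)     # named vector: one total per KO
}

# Combine per-sample KO totals into one KO x sample matrix.
# quant_files: named character vector of quant.sf paths
# tx2ko_list:  list of contig-to-KO tables, in the same order
build_ko_matrix <- function(quant_files, tx2ko_list) {
  per_sample <- Map(sum_by_ko, quant_files, tx2ko_list)
  all_kos <- sort(unique(unlist(lapply(per_sample, names))))
  counts <- matrix(0, nrow = length(all_kos), ncol = length(per_sample),
                   dimnames = list(all_kos, names(quant_files)))
  for (j in seq_along(per_sample)) {
    # KO terms absent from a sample keep a count of 0
    counts[names(per_sample[[j]]), j] <- per_sample[[j]]
  }
  counts
}

# Possible next step (assumes DESeq2 is installed; `coldata` and the
# `condition` column are placeholders for your own sample table):
# dds <- DESeq2::DESeqDataSetFromMatrix(round(counts), colData = coldata,
#                                       design = ~ condition)
```

Note that this only sums estimated counts. Unlike tximport, it does not carry over effective-length information as an offset, so whether that matters for KO-level analysis is something to judge for your own data.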