Question: Importing Salmon counts with tximport with host and virus genomes
rbenel0 wrote (3 months ago):

Hi, I have run Salmon on some new RNA-seq data that includes both a host and a virus (human and virus). When indexing the transcriptome, I concatenated the virus genome (as the virus has no annotated transcriptome) to the .fasta file of the host. I did the same with the .gtf file used for building the txdb object. However, after running tximport the virus is not included in the counts object. I believe the issue arises when constructing the tx2gene tibble, as there is no TXNAME for the virus...

# construct a transcript-to-gene table from the TxDb object
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- AnnotationDbi::select(txdb, keys = k, columns = "GENEID", keytype = "TXNAME")

Any suggestions for a workaround? I would really like to compare the virus counts per sample :)

Thank you!

Answer: Importing Salmon counts with tximport with host and virus genomes
Michael Love wrote (3 months ago):

Let me just explain what tximport does for gene-level summarization: for every transcript in the quant.sf file, it looks up the gene ID in the tx2gene table, and then collapses the information from the transcript level to the gene level. If you want the viral transcripts to pass through without being summarized, you should add rows to the tx2gene table that have the transcript name in both the first and second columns. The names must match the entries in the Name column of quant.sf exactly.
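
As a rough illustration of the suggestion above, the sketch below appends self-mapping rows to a tx2gene table. The host IDs and the viral sequence names (`virus_segment_1`, `virus_segment_2`) are made up for the example; in practice the viral names would be whatever appears in the FASTA headers given to Salmon, and the toy data frame stands in for the table returned by AnnotationDbi::select().

```r
# Toy stand-in for the tx2gene table built from the TxDb object;
# the host IDs below are illustrative only.
tx2gene <- data.frame(
  TXNAME = c("ENST00000000001", "ENST00000000002", "ENST00000000003"),
  GENEID = c("ENSG00000000001", "ENSG00000000001", "ENSG00000000002"),
  stringsAsFactors = FALSE
)

# Hypothetical viral sequence names (assumption): they must match the
# Name column of quant.sf exactly.
viral_ids <- c("virus_segment_1", "virus_segment_2")

# Map each viral sequence to itself so that it passes through
# gene-level summarization unchanged.
viral_rows <- data.frame(
  TXNAME = viral_ids,
  GENEID = viral_ids,
  stringsAsFactors = FALSE
)

# Combined table with host transcripts and viral pass-through rows.
tx2gene_full <- rbind(tx2gene, viral_rows)
```

The combined table would then be supplied through the `tx2gene` argument, for example `tximport(files, type = "salmon", tx2gene = tx2gene_full)`, where `files` is assumed to be the vector of quant.sf paths.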

Hi @Michael Love,

I know that tximport does gene-level summarization for every transcript; it is just that there are no transcripts for the virus (or at least not in the FASTA file I used)... but I understand what you mean: manually add the viral sequences to the tx2gene object, using the same name for both TXNAME and GENEID.

Thanks
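
To find which names need to be added manually, one option is to compare the Name column of a quant.sf file against the first column of tx2gene. The sketch below does this on a toy quant.sf written to a temporary file; the file contents and the name `virus_genome` are assumptions for illustration, not values from the thread.

```r
# Toy quant.sf written to a temporary file; the column layout follows
# Salmon's tab-separated quant.sf format.
quant_path <- tempfile(fileext = ".sf")
writeLines(c(
  "Name\tLength\tEffectiveLength\tTPM\tNumReads",
  "ENST00000000001\t1500\t1300.0\t10.5\t200",
  "virus_genome\t9000\t8800.0\t55.0\t4000"
), quant_path)

quant <- read.delim(quant_path, stringsAsFactors = FALSE)

# Toy tx2gene table containing only the host transcript.
tx2gene <- data.frame(
  TXNAME = "ENST00000000001",
  GENEID = "ENSG00000000001",
  stringsAsFactors = FALSE
)

# Names quantified by Salmon that have no row in tx2gene; these are the
# candidates for self-mapping rows.
missing_ids <- setdiff(quant$Name, tx2gene$TXNAME)
print(missing_ids)
```

Any name reported here has no gene mapping, so adding a row that maps it to itself is what lets it reach the gene-level counts.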