Question: RNA-seq tximport
26 days ago by tanyabioinfo

I am using the following command to run tximport:

txi <- tximport(files, type = "salmon", tx2gene = tx2gene, ignoreTxVersion = TRUE, dropInfReps = TRUE)

My tx2gene data frame looks like this:

             tx_id            gene_id
1 ENSMUST00000082387 ENSMUSG00000064336
2 ENSMUST00000179436 ENSMUSG00000095742
3 ENSMUST00000082388 ENSMUSG00000064337
4 ENSMUST00000177695 ENSMUSG00000094121
5 ENSMUST00000082389 ENSMUSG00000064338
6 ENSMUST00000082390 ENSMUSG00000064339

I have two questions. First, which of the following contains the TPM values from the individual quant.sf files?

summarizing abundance
summarizing counts
summarizing length

Second, since txi$abundance is indexed by gene ID, how can I get the transcript names so that I can cross-check the values against the quant.sf files and confirm that everything is running correctly?



modified 25 days ago • written 26 days ago by tanyabioinfo
26 days ago by Michael Love (United States)


Check out the help page, ?tximport.

You can use txOut=TRUE to get the transcript-level quantifications.
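
As a minimal sketch of the txOut=TRUE suggestion, the snippet below writes two toy quant.sf files to a temporary directory and imports them at the transcript level. The sample names and all numeric values are invented for illustration; only the column names follow the Salmon quant.sf layout. The tximport call is skipped if the package is not installed.

```r
# Toy stand-in for Salmon output: two samples with identical, invented values.
quant <- data.frame(
  Name            = c("ENSMUST00000082387", "ENSMUST00000179436", "ENSMUST00000082388"),
  Length          = c(1000, 1500, 2000),
  EffectiveLength = c(800, 1300, 1800),
  TPM             = c(10, 20, 30),
  NumReads        = c(100, 200, 300)
)

# Hypothetical sample directories under tempdir(), one quant.sf each
files <- file.path(tempdir(), c("sampleA", "sampleB"), "quant.sf")
names(files) <- c("sampleA", "sampleB")
for (f in files) {
  dir.create(dirname(f), recursive = TRUE, showWarnings = FALSE)
  write.table(quant, f, sep = "\t", quote = FALSE, row.names = FALSE)
}

if (requireNamespace("tximport", quietly = TRUE)) {
  # txOut = TRUE keeps transcripts as rows instead of summarizing to genes,
  # so no tx2gene table is needed at this step
  txi.tx <- tximport::tximport(files, type = "salmon",
                               txOut = TRUE, dropInfReps = TRUE)
  # One row per transcript ID, one column per sample; compare against
  # the TPM column of the corresponding quant.sf file
  print(txi.tx$abundance)
}
```

With real data, the rows of txi.tx$abundance can be compared line by line with the TPM column of each quant.sf file, and summarizeToGene() can then be applied to the transcript-level object with the tx2gene table to obtain the gene-level values.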

written 26 days ago by Michael Love
26 days ago by James W. MacDonald (United States)


The abundance list item (txi$abundance) contains the summarized, gene-level abundance measurements, which for Salmon input are the TPM values. If you want to do some spot checking, you could do

listmap <- split(tx2gene$tx_id, tx2gene$gene_id)

If I do that with a tx2gene data.frame I have handy, I get

> split(tx2gene$REFSEQ, tx2gene$ENTREZID)[1:5]
[1] "NM_001126230.1"

[1] "NM_001123523.1"

[1] "NM_001123524.1"

[1] "NM_001123525.1" "XM_014194536.1" "XM_014194537.1"

[1] "NM_001123526.1"

And you can see that, for example, Entrez Gene ID 100136352 (the fourth element above) has three transcripts that are summarized to generate a gene-level abundance. You can then find those transcripts' rows in the quant.sf file and sum their TPM values by hand.
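
The spot check described above can be sketched in base R as follows. The transcript IDs, gene IDs and TPM values are placeholders invented for this example; the version-stripping step is an assumption that mirrors the ignoreTxVersion=TRUE setting in the question, where quant.sf names carry a version suffix and tx2gene does not.

```r
# Toy stand-in for one sample's quant.sf (IDs and TPM values are invented)
quant <- data.frame(
  Name = c("TX1.1", "TX2.3", "TX3.1", "TX4.2"),
  TPM  = c(4.0, 1.5, 0.5, 7.0),
  stringsAsFactors = FALSE
)

# Toy tx2gene table without version suffixes (placeholder IDs)
tx2gene <- data.frame(
  tx_id   = c("TX1", "TX2", "TX3", "TX4"),
  gene_id = c("GENE_A", "GENE_A", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Drop everything from the first "." onward so the IDs match tx2gene
quant$tx_id <- sub("\\..*", "", quant$Name)

# Gene -> transcripts lookup, as in the answer
listmap <- split(tx2gene$tx_id, tx2gene$gene_id)

# Sum the TPM of each gene's transcripts by hand
by_hand <- sapply(listmap, function(tx) sum(quant$TPM[quant$tx_id %in% tx]))
print(by_hand)
```

With real data, each entry of by_hand would be compared against the matching row and sample column of txi$abundance; small differences may come from rounding in the quant.sf file.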


written 26 days ago by James W. MacDonald
25 days ago by tanyabioinfo


Thanks, James.

It worked for me.


written 25 days ago by tanyabioinfo
