RNA seq tximport
@tanyabioinfo-14091
Hi,
I am using the following command to run tximport:
txi <- tximport(files, type = "salmon", tx2gene = tx2gene, ignoreTxVersion = TRUE, dropInfReps = TRUE)
My tx2gene data frame looks like this:
tx_id gene_id
1 ENSMUST00000082387 ENSMUSG00000064336
2 ENSMUST00000179436 ENSMUSG00000095742
3 ENSMUST00000082388 ENSMUSG00000064337
4 ENSMUST00000177695 ENSMUSG00000094121
5 ENSMUST00000082389 ENSMUSG00000064338
6 ENSMUST00000082390 ENSMUSG00000064339
My first question is which of
summarizing abundance
summarizing counts
summarizing length
contains the TPM values from the individual quant.sf files. My second question: since txi$abundance is indexed by gene IDs, how can I get the transcript names so that I can cross-check the values against the quant.sf files and confirm that everything is running correctly?
Thanks
Tanya
@mikelove
Check out the help page, ?tximport. You can use txOut=TRUE to get the transcript-level quantifications.
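To make the txOut=TRUE suggestion concrete, here is a minimal sketch. It assumes the tximport package is installed, and it writes two tiny Salmon-style quant.sf files with made-up values to a temporary directory so that the call can run on its own; the sample names s1 and s2 and the helper make_quant are hypothetical.

```r
library(tximport)

# Hypothetical helper: write a tiny Salmon-style quant.sf with made-up values
# and return its path. Real files come from `salmon quant`.
make_quant <- function(tpm) {
  dir <- tempfile("sample_")
  dir.create(dir)
  quant <- data.frame(
    Name = c("ENSMUST00000082387", "ENSMUST00000179436", "ENSMUST00000082388"),
    Length = c(1000, 1500, 2000),
    EffectiveLength = c(800, 1300, 1800),
    TPM = tpm,
    NumReads = tpm * 10,
    stringsAsFactors = FALSE
  )
  path <- file.path(dir, "quant.sf")
  write.table(quant, path, sep = "\t", quote = FALSE, row.names = FALSE)
  path
}

# Named vector of file paths; the names become the column names.
files <- c(s1 = make_quant(c(10, 20, 30)), s2 = make_quant(c(5, 15, 25)))

# txOut = TRUE skips the gene-level summarization, so no tx2gene is needed
# and the rows are transcripts rather than genes.
txi.tx <- tximport(files, type = "salmon", txOut = TRUE, dropInfReps = TRUE)

# Transcript-by-sample matrix of abundances, directly comparable to the
# TPM column of each quant.sf.
head(txi.tx$abundance)
```

With real data, files would be the same vector of quant.sf paths used in the question. The transcript-level object can later be collapsed to genes with summarizeToGene(txi.tx, tx2gene), which should give the same kind of gene-level result as the original call.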
@james-w-macdonald-5106
The abundance list item contains the summarized abundance measurements. If you want to do some spot checking, you could do:
listmap <- split(tx2gene$tx_id, tx2gene$gene_id)
If I do that with a tx2gene data.frame I have handy, I get
> split(tx2gene$REFSEQ, tx2gene$ENTREZID)[1:5]
$`100135779`
[1] "NM_001126230.1"
$`100136349`
[1] "NM_001123523.1"
$`100136351`
[1] "NM_001123524.1"
$`100136352`
[1] "NM_001123525.1" "XM_014194536.1" "XM_014194537.1"
$`100136353`
[1] "NM_001123526.1"
You can see that, for example, Entrez Gene ID 100136352 has three transcripts that are summarized to generate a gene-level abundance. You can then look up the rows for those transcripts in the quant.sf file and sum them by hand.
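The by-hand check can be sketched in base R roughly as follows. All IDs and TPM values here are made up for illustration, and the sketch assumes that, with type="salmon" and the default countsFromAbundance setting, the gene-level abundance is the sum of the TPM column over a gene's transcripts. It also assumes the quant.sf names carry a version suffix, which is what ignoreTxVersion=TRUE in the question suggests.

```r
# Toy transcript-level table standing in for one quant.sf (made-up values).
quant <- data.frame(
  Name = c("txA1.1", "txB1.2", "txB2.1", "txB3.1"),
  TPM  = c(12.5, 3.0, 4.5, 0.5),
  stringsAsFactors = FALSE
)

# Toy tx2gene with the same column names as in the question (hypothetical IDs).
tx2gene <- data.frame(
  tx_id   = c("txA1", "txB1", "txB2", "txB3"),
  gene_id = c("geneA", "geneB", "geneB", "geneB"),
  stringsAsFactors = FALSE
)

# Drop the version suffix so the names match tx2gene, mirroring
# what ignoreTxVersion = TRUE is meant to do.
quant$tx_id <- sub("\\.[0-9]+$", "", quant$Name)

# Transcripts per gene, as in the answer above.
listmap <- split(tx2gene$tx_id, tx2gene$gene_id)

# Spot check one gene: sum the TPMs of its transcripts by hand.
by_hand <- sum(quant$TPM[quant$tx_id %in% listmap[["geneB"]]])

# Or sum every gene at once; the result has one row per gene.
gene_of_tx <- tx2gene$gene_id[match(quant$tx_id, tx2gene$tx_id)]
gene_tpm <- rowsum(quant$TPM, group = gene_of_tx)
```

For a real run, quant would be read from one sample's quant.sf (for instance with read.delim), and by_hand would be compared with the matching entry of txi$abundance, preferably with all.equal rather than ==, since the values are floating point.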
@tanyabioinfo-14091
Thanks James
It worked for me.