how to match salmon output to txname
@6db15c42

Hello. I am new to RNA-seq analysis, and I am having a hard time resolving a mismatch between the transcript names in my salmon output and the 'TXNAME' column of my tx2gene table.

I ran salmon in alignment-based mode, using a transcript FASTA file that I extracted with gffread from the genome sequence and the GFF3 file downloaded from NCBI.

# Command used to generate the transcript FASTA:
gffread -w transcripts.fasta -g tn2-sequence.fasta tn2-sequence.gff3

# The salmon output looks like this:
Name             Length  EffectiveLength  TPM        NumReads
gene-DO80_00010  417     167.000          53.992327  7.000
gene-DO80_00020  471     221.000          17.485556  3.000

I am preparing the data for DESeq2, so I need a tx2gene file for tximport. I made it with the code below:

library(GenomicFeatures)
gff_file <- "tn2-sequence.gff3"
file.exists(gff_file)
txdb <- makeTxDbFromGFF(gff_file)
keytypes(txdb)
columns(txdb)

# Map each transcript name to its gene ID
k <- keys(txdb, keytype = "TXNAME")
tx_map <- AnnotationDbi::select(txdb, keys = k,
                                columns = "GENEID", keytype = "TXNAME")
View(tx_map)  # View() is capitalized in base R
tx2gene <- tx_map
write.csv(tx2gene, file = "tx2gene.csv", row.names = FALSE, quote = FALSE)
View(tx2gene)

It gives me output like this:

TXNAME     GENEID
1 DO80_00050 DO80_00050
2      panC DO80_00060

Because the salmon names (e.g. gene-DO80_00010) do not match the TXNAME values (e.g. DO80_00050 or panC), I cannot move on to the downstream analysis. Is there a solution to this? Which of the two should I modify, and how? Thank you.

GenomicFeatures tximport salmon • 1.1k views
@mikelove

You just need a file that lists the transcript names that you used for quantification and then the gene they are associated with. Maybe you can find someone to consult with on how to do this for your particular dataset.
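
As a rough illustration of that idea, here is a minimal sketch in base R that builds the table directly from the names salmon reported, so the first column matches the quantification by construction. It rests on two assumptions that are not confirmed in this thread: that gffread wrote one FASTA record per gene (so each salmon name stands for exactly one gene), and that dropping the "gene-" prefix gives the gene ID you want. The inline text only repeats the two rows shown in the question; with real data you would read your own quant.sf instead.

```r
# Two rows copied from the salmon output in the question; with real data,
# something like read.delim("quant.sf") would replace this inline text.
quant_text <- "Name Length EffectiveLength TPM NumReads
gene-DO80_00010 417 167.000 53.992327 7.000
gene-DO80_00020 471 221.000 17.485556 3.000"
quant <- read.table(text = quant_text, header = TRUE, stringsAsFactors = FALSE)

# ASSUMPTION: one FASTA record per gene, and the gene ID is the salmon name
# without its "gene-" prefix. Check this against your own annotation.
tx2gene <- data.frame(
  TXNAME = quant$Name,
  GENEID = sub("^gene-", "", quant$Name),
  stringsAsFactors = FALSE
)

# Names that salmon quantified but the table does not cover (should be none).
missing_tx <- setdiff(quant$Name, tx2gene$TXNAME)
print(tx2gene)
```

If the assumptions hold for your data, a table like this can be passed to tximport through its tx2gene argument along with type = "salmon". The same setdiff() check, run against the TxDb-derived table from the question, should show which names fail to match.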
