TxDb conversion of transcript id to gene id
Luke • 0
@5576e7cd
Last seen 6 months ago
United States

I am attempting to create a TxDb and tx2gene for Sus scrofa transcripts.

I am struggling to map my salmon transcript quant.sf files to my genome. I am using the transcript file from https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000003025.6/, and tried to create a TxDb from the genomic sequence FASTA file, GFF3 and GTF. My IDs never seem to match, and I am unsure how to fix this, as I am pulling both files from the same site. I have also tried using the TxDb UCSC susScr11 reference genome, but I get the same non-matching IDs.

Code used with error.

TxDb.Sscrofa.UCSC.susScr11.refGene TxDb txi • 560 views
@james-w-macdonald-5106
Last seen 8 hours ago
United States

My go-to in this situation is to take the transcript IDs from one of the quant.sf files and then map those to gene IDs using the TxDb. An example using an EnsDb from AnnotationHub looks like this:

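## packages used below
library(AnnotationHub)
library(ensembldb)
## get one file (samps is a sample table defined elsewhere)
tx2gene <- read.table(paste0("../data/aligned/", samps$Sample[1], "/quant.sf"), header = TRUE)
## get an EnsDb
hub <- AnnotationHub()
ensdb <- hub[["AH95744"]]
## generate the tx2gene using the transcript IDs, with any trailing version suffix removed
tx2gene <- select(ensdb, keys = gsub("\\.[0-9]+$", "", tx2gene[,1]), columns = "GENEID", keytype = "TXNAME")[,1:2]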

You could do something similar, substituting in the TxDb you generated.
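If the IDs still do not match, it can help to check how many quant.sf names are found in the annotation before and after stripping the version suffix. The following is only a base R sketch of that check: the transcript and gene IDs are made up, `annot` stands in for whatever TXNAME/GENEID table you pull from your TxDb, and `strip_version` is a hypothetical helper, not a function from any package.

```r
## Toy stand-ins (hypothetical IDs): the first column of a quant.sf file and
## a transcript-to-gene table extracted from an annotation.
quant_ids <- c("XM_000001.2", "XM_000002.1", "NM_000003.3")
annot <- data.frame(
  TXNAME = c("XM_000001", "XM_000002", "NM_000004"),
  GENEID = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

## Hypothetical helper: drop a trailing version suffix such as ".2",
## using the same pattern as the gsub() call in the example above.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

## Fraction of quantified transcripts found in the annotation,
## with the raw names and with the version suffix removed.
match_raw <- mean(quant_ids %in% annot$TXNAME)
match_stripped <- mean(strip_version(quant_ids) %in% annot$TXNAME)

## Build a tx2gene keyed by the original quant.sf names, and keep track
## of any transcripts that still have no gene assigned.
idx <- match(strip_version(quant_ids), annot$TXNAME)
tx2gene <- data.frame(TXNAME = quant_ids,
                      GENEID = annot$GENEID[idx],
                      stringsAsFactors = FALSE)
unmatched <- quant_ids[is.na(idx)]
tx2gene <- tx2gene[!is.na(tx2gene$GENEID), ]
```

Looking at a few entries of `unmatched` next to a few transcript names from the annotation usually shows what differs (a version suffix, a prefix, or a different ID scheme altogether), and the pattern in `strip_version` can then be adjusted to suit.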
