Can't load UCSC transcript IDs
@b43229d8
United Kingdom

I'm having trouble using the following function to load gene IDs from UCSC to annotate my RNA-seq results. I get the following error when I try to run it:

Error in .testForValidKeys(x, keys, keytype, fks) : 
  None of the keys entered are valid keys for 'TXNAME'. Please use the keys method to see a listing of valid arguments.

The function:

transcriptToSymbolAndDescription = function(ucsc_transcriptID)
{
    df = select(TxDb.Mmusculus.UCSC.mm10.knownGene,
                keys=ucsc_transcriptID,
                columns='GENEID',
                keytype='TXNAME')

    symbolAndDescription = select(org.Mm.eg.db, df$GENEID, c("SYMBOL", "GENENAME"))
    row.names(symbolAndDescription) = ucsc_transcriptID
    return(symbolAndDescription)
}

Any help or suggestions would be greatly appreciated! :)

Guido Hooiveld ★ 3.9k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …

It is not fully clear what your exact input is, but you will likely need to use an organism-level annotation package of the Homo.sapiens type (for mouse, that would be Mus.musculus). See, e.g., James's answer here: Converting between UCSC id and gene symbol with bioconductor annotation resources

shepherl 3.8k
@lshep
United States

You can list the valid keys for the TxDb object with keys(), as the error message suggests:

library(TxDb.Mmusculus.UCSC.mm10.knownGene)

txdb = TxDb.Mmusculus.UCSC.mm10.knownGene

> head(keys(txdb, keytype="TXNAME"))
[1] "ENSMUST00000193812.1" "ENSMUST00000082908.1" "ENSMUST00000192857.1"
[4] "ENSMUST00000161581.1" "ENSMUST00000192183.1" "ENSMUST00000193244.1"

If this is not the format of your ucsc_transcriptID values, then you probably want to use a different keytype, perhaps TXID?

> keytypes(txdb)
[1] "CDSID"    "CDSNAME"  "EXONID"   "EXONNAME" "GENEID"   "TXID"     "TXNAME"  
> head(keys(txdb, keytype="TXID"))
[1] "1" "2" "3" "4" "5" "6"

The error says that none of the given ucsc_transcriptID values can be found, so another quick check is whether any of them are valid TXNAME keys at all:

any((ucsc_transcriptID %in% keys(txdb, keytype='TXNAME')))
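
If that check returns FALSE, it can help to see whether the IDs differ from the valid keys only by the trailing version suffix (the ".1" in the keys above). The following is a minimal base-R sketch of that idea, not part of any Bioconductor API: diagnose_keys is a hypothetical helper, valid_keys is a stand-in copied from the head(keys(...)) output above (in a real session it would be keys(txdb, keytype="TXNAME")), and the query IDs are made-up placeholders.

```r
# Hypothetical helper: compare query IDs against the valid TXNAME keys,
# both exactly and after dropping a trailing ".<digits>" version suffix.
diagnose_keys <- function(ids, valid_keys) {
  strip_version <- function(x) sub("\\.[0-9]+$", "", x)
  data.frame(
    id = ids,
    exact_match = ids %in% valid_keys,
    match_ignoring_version = strip_version(ids) %in% strip_version(valid_keys),
    stringsAsFactors = FALSE
  )
}

# Stand-in for keys(txdb, keytype = "TXNAME"), copied from the output above.
valid_keys <- c("ENSMUST00000193812.1",
                "ENSMUST00000082908.1",
                "ENSMUST00000192857.1")

# Placeholder query IDs (assumptions for illustration only):
# an exact match, an ID missing its version suffix, and an ID in another format.
query <- c("ENSMUST00000193812.1", "ENSMUST00000082908", "uc000aaa.1")

result <- diagnose_keys(query, valid_keys)
print(result)
```

Rows where only match_ignoring_version is TRUE suggest a version-suffix mismatch. Rows where both columns are FALSE suggest the IDs come from a different naming scheme or annotation release than the TxDb package in use.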