problem with TXNAME -> SYMBOL mapping in Homo.sapiens library
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
Hello,

I have a problem with the mapping from TXNAME to the gene SYMBOL. For most transcripts it works fine, but for some it doesn't:

> library(Homo.sapiens)
> select(Homo.sapiens, cols="SYMBOL", keys= "uc021wml.1", keytype="TXNAME")
Error in .testIfKeysAreOfProposedKeytype(x, keys, keytype) :
  None of the keys entered are valid keys for the keytype specified.

The traceback is as follows:

> traceback()
10: stop("None of the keys entered are valid keys for the keytype specified.")
9: .testIfKeysAreOfProposedKeytype(x, keys, keytype)
8: .select(x, keys, cols, keytype, jointype = jointype)
7: .local(x, keys, cols, keytype, ...)
6: select(.makeReal(nodeName), keys = fromKeys, cols = needCols[[nodeName]],
       keytype = toKey)
5: select(.makeReal(nodeName), keys = fromKeys, cols = needCols[[nodeName]],
       keytype = toKey)
4: .getSelects(x, keytype, keys, needCols, visitNodes)
3: .select(x, keys, cols, keytype, ...)
2: select(Homo.sapiens, cols = "SYMBOL", keys = "uc021wml.1", keytype = "TXNAME")
1: select(Homo.sapiens, cols = "SYMBOL", keys = "uc021wml.1", keytype = "TXNAME")

However, when I check whether the problematic transcript name is present in the Homo.sapiens database, it turns out that it is there. I can also retrieve other information about this transcript:

> "uc021wml.1" %in% keys(Homo.sapiens, keytype="TXNAME")
[1] TRUE
> select(Homo.sapiens, cols="TXSTART", keys= "uc021wml.1", keytype="TXNAME")
      TXNAME  TXSTART
1 uc021wml.1 22385572

Is there a way to solve this problem? I would appreciate your help.

Best regards,
Aleksandra Pfeifer

-- output of sessionInfo():

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] Homo.sapiens_1.1.1
 [2] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2
 [3] org.Hs.eg.db_2.9.0
 [4] GO.db_2.9.0
 [5] RSQLite_0.11.4
 [6] DBI_0.2-7
 [7] OrganismDbi_1.2.0
 [8] GenomicFeatures_1.12.4
 [9] GenomicRanges_1.12.5
[10] IRanges_1.18.4
[11] AnnotationDbi_1.22.6
[12] Biobase_2.20.1
[13] BiocGenerics_0.6.0

loaded via a namespace (and not attached):
 [1] BSgenome_1.28.0    Biostrings_2.28.0  RBGL_1.36.2        RCurl_1.95-4.1
 [5] Rsamtools_1.12.4   XML_3.98-1.1       biomaRt_2.16.0     bitops_1.0-6
 [9] graph_1.38.3       rtracklayer_1.20.4 stats4_3.0.1       tools_3.0.1
[13] zlibbioc_1.6.0

--
Sent via the guest posting facility at bioconductor.org.
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.4 years ago
United States
Hi Aleksandra,

That's a good question! First of all, you may want to know that the newer packages don't even have a name for that transcript: it has been dropped from the latest transcriptomes coming out of UCSC. But it's still a great question, so let me also explain what is happening in the older data that you are using.

In these older packages there was a transcript name from UCSC, but it was *not* associated with any gene ID. So it is a valid key, because it can be mapped to some values inside the transcriptome, but it cannot be mapped to anything outside of the transcriptome. You almost had enough information to see this for yourself with the select queries that you ran. For example, if you run the following select:

select(Homo.sapiens, cols=c("GENEID","TXSTART"), keys= "uc021wml.1", keytype="TXNAME")

you will get:

      TXNAME GENEID  TXSTART
1 uc021wml.1   <NA> 22385572

This tells you that while there *is* transcript information for this name ("TXCHROM" etc. will also work), there is no GENEID associated with it. Unfortunately, no gene ID means there is also no way to look up information like the gene SYMBOL or any other data that is associated at the gene level.

So the short answer is that there is no gene symbol for this transcript name because we don't have any way to know what gene it belongs to.

Hope this helps,

Marc
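
To make the explanation above concrete, here is a minimal base-R sketch of the two-step lookup (TXNAME to GENEID, then GENEID to SYMBOL). It does not call Homo.sapiens at all. The two data frames are toy stand-ins, and everything except the uc021wml.1 row (the second transcript, the gene ID "1001" and the symbol "GENE_A") is made up for illustration. It only shows why a transcript without a gene ID cannot reach a symbol, and how a left join reports that as NA instead of an error.

```r
# Toy stand-in for the transcript table. The first row mirrors the thread;
# the second row is a hypothetical transcript that does have a gene ID.
tx_table <- data.frame(
  TXNAME  = c("uc021wml.1", "tx_example.1"),
  GENEID  = c(NA, "1001"),
  TXSTART = c(22385572, 100),
  stringsAsFactors = FALSE
)

# Toy stand-in for the gene-level table (hypothetical values).
gene_table <- data.frame(
  GENEID = "1001",
  SYMBOL = "GENE_A",
  stringsAsFactors = FALSE
)

# Step 1: look up the gene IDs for the requested transcript names.
keys_wanted <- c("uc021wml.1", "tx_example.1")
tx_hits <- tx_table[tx_table$TXNAME %in% keys_wanted, c("TXNAME", "GENEID")]

# Step 2: left join on GENEID. Transcripts without a gene ID are kept
# and simply get an NA symbol.
result <- merge(tx_hits, gene_table, by = "GENEID", all.x = TRUE)

# Restore the order of the requested keys and tidy the columns.
result <- result[match(keys_wanted, result$TXNAME),
                 c("TXNAME", "GENEID", "SYMBOL")]
rownames(result) <- NULL
print(result)
```

With the real packages, the same idea would be to request GENEID first, drop the transcripts where it is NA, and only then ask for SYMBOL. Note also that later AnnotationDbi releases renamed the `cols` argument of `select()` to `columns`, so the calls in this thread may need that change on a current installation.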