org.Bt.eg.db / annotation problem
@iain-gallagher-2532
Last seen 9.3 years ago
United Kingdom
Hello List,

Perhaps someone could help me with this. I am annotating some 200 genes with the org.Bt.eg.db package. The identifier I have for the genes is an Ensembl ID (e.g. ENSBTAG00000009012). I am attempting to return the Entrez ID with the following code (where rownames(topGenes$table) is my vector of Ensembl IDs):

    egIds <- unlist(mget(rownames(topGenes$table), org.Bt.egENSEMBL2EG, ifnotfound=NA))

This returns a named vector, but it contains Ensembl IDs that were not in my query:

    setdiff(names(egIds), rownames(topGenes$table))
     [1] "ENSBTAG000000375581" "ENSBTAG000000375582" "ENSBTAG000000312311"
     [4] "ENSBTAG000000312312" "ENSBTAG000000306301" "ENSBTAG000000306302"
     [7] "ENSBTAG000000359951" "ENSBTAG000000359952" "ENSBTAG000000005461"
    [10] "ENSBTAG000000005462" "ENSBTAG000000005041" "ENSBTAG000000005042"
    [13] "ENSBTAG000000307771" "ENSBTAG000000307772" "ENSBTAG000000135691"
    [16] "ENSBTAG000000135692"

Could someone explain why this is happening? The IDs above (i.e. those not in my query) are returned with Entrez IDs:

    egIds[setdiff(names(egIds), rownames(topGenes$table))]
    ENSBTAG000000375581
               "281212"
    etc etc

Thanks,
iain

    > sessionInfo()
    R version 2.13.1 (2011-07-08)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
     [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
     [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
     [7] LC_PAPER=en_GB.utf8       LC_NAME=C
     [9] LC_ADDRESS=C              LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] org.Bt.eg.db_2.5.0   RSQLite_0.9-4        DBI_0.2-5
    [4] AnnotationDbi_1.14.1 Biobase_2.10.0       edgeR_2.2.5

    loaded via a namespace (and not attached):
    [1] limma_3.6.6  tools_2.13.1
jason0701 ▴ 190
@jason0701-3921
Last seen 5.0 years ago
Hi Iain,

I think this is due to multiple matches for some keys. In your example, "ENSBTAG000000375581" and "ENSBTAG000000375582" are likely both "ENSBTAG00000037558", with 1 and 2 appended by unlist() because that key has two matches.

Jason
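The renaming described here can be reproduced in base R without the annotation package. The sketch below is only an illustration: `hits` is a hypothetical stand-in for the list that mget() returns, "281212" is the Entrez ID quoted in the question, and the other two Entrez IDs are invented placeholders. It shows unlist() appending a sequence number to the name of any element with more than one value, then two ways to keep the original Ensembl IDs intact.

```r
# Toy stand-in for the list returned by mget(): one element per Ensembl ID.
# "281212" comes from the question; "000001" and "000002" are invented.
hits <- list(
  ENSBTAG00000037558 = c("281212", "000001"),  # key with two matches
  ENSBTAG00000009012 = "000002"                # key with one match
)

# unlist() makes names unique by appending 1, 2, ... to multi-match keys.
flat <- unlist(hits)
print(names(flat))
# "ENSBTAG000000375581" "ENSBTAG000000375582" "ENSBTAG00000009012"

# Option 1: find the keys that have more than one match.
n_hits <- sapply(hits, length)
multi <- names(hits)[n_hits > 1]
print(multi)

# Option 2: build a long table that repeats the unmodified Ensembl ID
# once per Entrez ID, so no suffix is ever added.
long <- data.frame(
  ensembl = rep(names(hits), n_hits),
  entrez = unlist(hits, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```

With the real data, `hits` would be the result of the mget() call before unlist() is applied; which of the multiple Entrez IDs to keep for a given gene is a separate decision that this sketch does not make.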
Thanks Jason.

Chasing this up, it seems that neither "ENSBTAG000000375581" nor "ENSBTAG000000375582" is in Ensembl. The data was passed to me by others, so I think I'll do some judicious filtering based on the org package.

Best,
iain

--- On Tue, 2/8/11, Jason Lu <jasonlu68 at gmail.com> wrote:

> From: Jason Lu <jasonlu68 at gmail.com>
> Subject: Re: [BioC] org.Bt.eg.db / annotation problem
> To: bioconductor at stat.math.ethz.ch
> Date: Tuesday, 2 August, 2011, 15:03
>
> Hi Iain,
>
> I think this is due to multiple matches for some keys.
> In your example, "ENSBTAG000000375581" and
> "ENSBTAG000000375582" are
> likely "ENSBTAG00000037558".
>
> Jason
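One way the filtering mentioned in this reply might look, sketched in base R. Everything here is illustrative: `query_ids` and `mapped_keys` are hypothetical vectors, and which IDs appear in each is an assumption made for the example. With the real package, the set of known keys could be obtained with mappedkeys(org.Bt.egENSEMBL2EG) from AnnotationDbi.

```r
# Hypothetical query; the last entry is deliberately not a known key.
query_ids <- c("ENSBTAG00000009012",
               "ENSBTAG00000037558",
               "ENSBTAG000000375581")

# Toy stand-in for mappedkeys(org.Bt.egENSEMBL2EG); membership is assumed.
mapped_keys <- c("ENSBTAG00000009012",
                 "ENSBTAG00000037558",
                 "ENSBTAG00000031231")

# Split the query into IDs the map knows about and IDs it does not.
keep <- query_ids %in% mapped_keys
valid_ids <- query_ids[keep]
dropped_ids <- query_ids[!keep]

print(valid_ids)
print(dropped_ids)
```

Checking the query against the known keys before the lookup also makes it easy to tell IDs that are missing from the map apart from names that were altered by a later unlist() call.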