Genbank accession annotation?
Ed Siefker ▴ 230
@ed-siefker-5136
Last seen 12 months ago
United States
What package would I need to transform GenBank accession numbers into gene symbols or Entrez Gene IDs? For example, if I search "R28020" on NCBI, it tells me that "This EST is one of 1366 sequences matched to RAB2A: RAB2A, member RAS oncogene family." Is there a metadata package that has this kind of information in it? I have a couple hundred such identifiers that I need to map to genes. I'd like to be able to run getSYMBOL("R28020", "some_annotation_package") and get a useful result. Any ideas?
@james-w-macdonald-5106
Last seen 4 hours ago
United States
Hi Ed,

Hypothetically you would want to use the org.Hs.eg.db package. However, not all GenBank accession numbers will be annotated, presumably because they have been retired. Alternatively, you could use biomaRt.

However, the example ID you give is not annotated by either source.

Best,
Jim

On Wednesday, October 02, 2013 4:05:51 PM, Ed Siefker wrote:
> What package would I need to transform Genbank accession numbers into gene symbols or entrez gene ids? [...]

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
Hi Ed,

For this particular EST sequence, you can find the annotation in UniGene: http://www.ncbi.nlm.nih.gov/unigene

If you have many such EST sequences, I recommend downloading the UniGene file ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.data.gz and doing some horrible parsing (with your favorite parsing language).

For "R28020", you will get:

    ID          Hs.369017
    TITLE       RAB2A, member RAS oncogene family
    GENE        RAB2A
    CYTOBAND    8q12.1
    GENE_ID     5862
    LOCUSLINK   5862
    SEQUENCE    ACC=R28020.1; NID=g784155; CLONE=IMAGE:133972; END=3'; LID=271; SEQTYPE=EST

Regards,
Hans-Rudolf

On 10/03/2013 10:34 PM, James W. MacDonald wrote:
> Hypothetically you would want to use the org.Hs.eg.db package. [...]
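The parsing of the UniGene flat file suggested in this reply could look roughly like the sketch below, written in Python. It rests on some assumptions that are not stated in the thread: that each line is a field name followed by whitespace and a value (as in the R28020 excerpt), that records are terminated by a line starting with `//`, and that the version suffix (`.1`) can be dropped when matching accessions. The function name `parse_unigene` and the choice of fields to keep are hypothetical.

```python
import re
from typing import Dict, Iterable, Set

# Pulls the accession out of a SEQUENCE line such as "ACC=R28020.1; NID=...".
ACC_RE = re.compile(r"ACC=([^;\s]+)")

# Fields copied from each cluster record (an illustrative choice).
KEEP = ("ID", "TITLE", "GENE", "GENE_ID")


def parse_unigene(lines: Iterable[str], wanted: Set[str]) -> Dict[str, Dict[str, str]]:
    """Map each wanted accession (version suffix ignored) to its cluster's fields."""
    hits: Dict[str, Dict[str, str]] = {}
    record: Dict[str, str] = {}
    matched = []  # wanted accessions seen in the current record

    def flush() -> None:
        for acc in matched:
            hits[acc] = dict(record)

    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):  # assumed record terminator
            flush()
            record, matched = {}, []
            continue
        parts = line.split(None, 1)  # field name, then the rest of the line
        if len(parts) != 2:
            continue
        key, value = parts[0], parts[1].strip()
        if key in KEEP:
            record[key] = value
        elif key == "SEQUENCE":
            m = ACC_RE.search(value)
            if m:
                acc = m.group(1).split(".")[0]  # "R28020.1" -> "R28020"
                if acc in wanted:
                    matched.append(acc)
    flush()  # covers a final record with no terminator line
    return hits


# The record quoted in the reply, used as a tiny in-memory test input.
SAMPLE = """\
ID          Hs.369017
TITLE       RAB2A, member RAS oncogene family
GENE        RAB2A
CYTOBAND    8q12.1
GENE_ID     5862
LOCUSLINK   5862
SEQUENCE    ACC=R28020.1; NID=g784155; CLONE=IMAGE:133972; END=3'; LID=271; SEQTYPE=EST
//
"""

result = parse_unigene(SAMPLE.splitlines(), {"R28020"})
print(result["R28020"]["GENE"], result["R28020"]["GENE_ID"])
```

For a downloaded copy of the file, the same function can be fed a handle opened with `gzip.open("Hs.data.gz", "rt")`, so the file is streamed line by line rather than loaded into memory. Accessions that never appear in a SEQUENCE line are simply absent from the returned dictionary.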