How to check if gene name is an alias or misspelt
Daniel Brewer ★ 1.9k
@daniel-brewer-1791
Last seen 10.6 years ago
Hello,

I have a list of gene names that are not official gene symbols. Normally in this case I would search for each gene in Entrez to see whether it is an alias, and then take the official symbol. Is there a way to (semi-)automate this within Bioconductor?

If this fails, I normally google the name to see whether it is likely to be a misspelling (S instead of 5, etc.). Any suggestions for that?

Many thanks,

Dan

--
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
Cancer • 1.2k views
@sean-davis-490
Last seen 6 weeks ago
United States
On Wed, Apr 8, 2009 at 9:52 AM, Daniel Brewer <daniel.brewer@icr.ac.uk> wrote:

> Hello,
>
> I have a list of genes which are not official gene symbols. Normally in
> this case I would search gene in entrez to see if it is an alias and
> then take the official symbol. Is there a way to (semi) automate this
> within bioconductor?
>
> If this fails I normally google it to see if it is likely to be a
> misspelling S instead of 5 etc. Any suggestions for that?

It is often a good idea to check the annotation packages for this type of thing. For the org.XX.eg.db packages (where XX represents the organism of interest), there is the org.XX.egALIAS2EG map, which maps aliases to Entrez Gene IDs.

Sean
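A minimal sketch of the alias lookup described above, written so that it runs without Bioconductor installed. The two toy lists are illustrative stand-ins for what `as.list(org.Hs.egALIAS2EG)` and `as.list(org.Hs.egSYMBOL)` would return, and the helper name `alias_to_symbol` is hypothetical, not part of any package.

```r
# Toy stand-ins (illustrative values only). With org.Hs.eg.db loaded, one
# would instead use as.list(org.Hs.egALIAS2EG) and as.list(org.Hs.egSYMBOL).
alias2eg  <- list(p53 = "7157", TP53 = "7157", HER2 = "2064", ERBB2 = "2064")
eg2symbol <- list("7157" = "TP53", "2064" = "ERBB2")

# Hypothetical helper: look each name up as an alias, then map the resulting
# Entrez Gene ID(s) to the official symbol. Unknown names yield NA rows.
alias_to_symbol <- function(aliases, alias2eg, eg2symbol) {
  rows <- lapply(aliases, function(a) {
    ids <- alias2eg[[a]]
    if (is.null(ids)) {
      return(data.frame(alias = a, entrez = NA_character_,
                        symbol = NA_character_, stringsAsFactors = FALSE))
    }
    # One alias may map to several Entrez IDs; keep one row per ID
    symbols <- vapply(ids, function(id) {
      s <- eg2symbol[[id]]
      if (is.null(s)) NA_character_ else s[1]
    }, character(1), USE.NAMES = FALSE)
    data.frame(alias = a, entrez = ids, symbol = symbols,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

res <- alias_to_symbol(c("p53", "HER2", "S-HT3c2"), alias2eg, eg2symbol)
print(res)
```

Names that come back with `NA` are the candidates for a misspelling check.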
@herve-pages-1542
Last seen 2 days ago
Seattle, WA, United States
Hi Dan,

The org.XX.egALIAS2EG map combined with some fuzzy matching function can help you do this:

    > library(org.Hs.eg.db)
    > get("S-HT3c2", org.Hs.egALIAS2EG)
    Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
      value for "S-HT3c2" not found
    > agrep("S-HT3c2", keys(org.Hs.egALIAS2EG), value=TRUE, max.distance=1)
    [1] "5-HT3c2"

The 'max.distance' argument lets you control the maximum number of misspelled letters (including inserted/deleted letters):

    > get("WUGSC:H-DJO747G182", org.Hs.egALIAS2EG)
    Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
      value for "WUGSC:H-DJO747G182" not found
    > agrep("WUGSC:H-DJO747G182", keys(org.Hs.egALIAS2EG), value=TRUE, max.distance=2)
    character(0)
    > agrep("WUGSC:H-DJO747G182", keys(org.Hs.egALIAS2EG), value=TRUE, max.distance=3)
    [1] "WUGSC:H_DJ0747G18.2"

Cheers,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
E-mail: hpages at fhcrc.org
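The fuzzy-matching step above can be wrapped for a whole list of names. This is only a sketch: the helper name `suggest_names` is hypothetical, and the short `known` vector is a stand-in for `keys(org.Hs.egALIAS2EG)` so that the example runs with base R alone.

```r
# Hypothetical helper: for each query, return it unchanged when it is an
# exact known name, otherwise the known names that agrep() considers within
# max_dist edits (substitutions, insertions, deletions).
suggest_names <- function(queries, known, max_dist = 1) {
  out <- lapply(queries, function(q) {
    if (q %in% known) return(q)   # exact hit, no fuzzy search needed
    agrep(q, known, value = TRUE, max.distance = max_dist)
  })
  names(out) <- queries
  out
}

# Stand-in for keys(org.Hs.egALIAS2EG); illustrative values only
known <- c("5-HT3c2", "TP53", "BRCA1")

sug <- suggest_names(c("S-HT3c2", "TP53", "XYZ123"), known)
print(sug)
```

Note that `agrep()` matches the pattern approximately against any part of each candidate string, so short queries can match many longer names. Suggestions are best reviewed by eye rather than accepted automatically.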