affy ids from gene symbols
@iain-gallagher-2532
Last seen 8.8 years ago
United Kingdom
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080307/3935ba6d/attachment.pl
@martin-morgan-1513
Last seen 15 days ago
United States
If you have a list of symbols and don't need to do any pattern matching,

    > syms = c("COPA")

you can just

    > library(annotate)
    > rmap = revmap(getAnnMap("SYMBOL", "hgu95av2.db"))
    > mget(syms, rmap)
    $COPA
    [1] "36962_at"

Reading Marc's email, I guess there is also

    > map = getAnnMap("ALIAS2PROBE", "hgu95av2.db")
    > mget(syms, map)

Martin

IAIN GALLAGHER wrote:
> Hello list.
>
> I would like to return the Affymetrix probe ids for a list of genes. Normally I would do this through biomaRt but the service is down all weekend.
>
> I know the probe ids can be returned one at a time using regular expressions via
>
> > library(hgu133plus2)
> > symbols<-unlist(as.list(hgu133plus2SYMBOL))
> > gene1<-grep('^COPA$', symbols)
> > symbols[gene1]
>
> but I was wondering if there was a way to loop through the list of genes and 'grep' each one individually.
>
> Thanks for any advice.
>
> Iain
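One thing to watch with mget() is a symbol that has no entry in the map. The following is a minimal sketch in base R only, with a plain environment standing in for the reversed SYMBOL map; "NOTAGENE" is a made-up symbol, and the single probe ID is the one from the output above. Whether your installed annotation map accepts ifnotfound = NA in the same way is an assumption worth checking in its help page.

```r
# Toy stand-in for the reversed SYMBOL map: an environment keyed by symbol.
rmap <- new.env()
assign("COPA", "36962_at", envir = rmap)

syms <- c("COPA", "NOTAGENE")   # "NOTAGENE" is a made-up symbol

# Without ifnotfound, base mget() stops with an error on a missing key.
# With ifnotfound = NA, missing keys come back as NA instead.
hits <- mget(syms, envir = rmap, ifnotfound = NA)

# Separate the symbols that matched from those that did not.
is_missing <- vapply(hits, function(x) all(is.na(x)), logical(1))
unmatched  <- names(hits)[is_missing]
found      <- hits[!is_missing]
```

Here `unmatched` collects the symbols worth checking by hand (or against an alias map), and `found` keeps the same list shape as the mget() result.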
@james-w-macdonald-5106
Last seen 3 hours ago
United States
Say your input vector of symbol names is called 'input'.

    complist <- vector("list", length(input))
    names(complist) <- input
    library(hgu133plus2.db)  ## note I am using the new package type!!

    ## two-column table: probe_id, symbol
    mapp <- toTable(hgu133plus2SYMBOL)

    ## anchor each symbol with ^ and $ so grep() only returns exact matches;
    ## seq_along() keeps the loop from running when 'input' is empty
    for (i in seq_along(input))
        complist[[i]] <- mapp[grep(paste("^", input[i], "$", sep = ""), mapp[, 2]), 1]

Depending on whether you want to assume your symbols exactly match the annotation package symbols, you might want to add in a tolower(), and possibly gsub() to remove things like '(', ')', '-', etc.

Best,

Jim
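As a variation on the loop above (not Jim's method, just an alternative sketch), exact matching can be done with == after normalising case, which sidesteps regular expressions entirely. The table below is a toy stand-in for the toTable() result: the probe IDs are invented and "NOTAGENE" is a made-up symbol.

```r
# Toy stand-in for toTable(hgu133plus2SYMBOL); the probe IDs are invented.
mapp <- data.frame(
  probe_id = c("p1_at", "p2_at", "p3_at", "p4_at"),
  symbol   = c("COPA", "COPA", "HLA-A", "TP53"),
  stringsAsFactors = FALSE
)

input <- c("copa", "HLA-A", "NOTAGENE")

# Normalise case on both sides, then compare with == instead of a regex,
# so characters such as '(' are never interpreted as pattern syntax.
key <- toupper(mapp$symbol)
complist <- lapply(toupper(input), function(s) mapp$probe_id[key == s])
names(complist) <- input
```

Symbols with no match come back as a zero-length character vector, so `lengths(complist) == 0` flags them.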
Hi Iain,

If you have gene symbols and want to match them to probes, you could also use the new "hgu133plus2ALIAS2PROBE" mapping found inside the hgu133plus2.db package. It maps ALL known gene symbols instead of just the most commonly used (standard) ones. This can be good if your list contains less common gene symbols for some of the genes that you are looking for.

So to summarize, we have two mappings that can help you:

"hgu133plus2SYMBOL", which matches the most common "standard" gene symbol (only one per gene) to each probe, and

"hgu133plus2ALIAS2PROBE", which matches ALL known gene symbols (known to NCBI) to each probe.

The danger in using the first of these is that if you have an odd symbol name in your list, you might not get a match. The danger in using the second arises if your gene symbol list contains two different symbol names for the same gene. In that case, you could match each of them and not know that you had hit the same gene twice.

Marc
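The second danger can be checked for after the lookup. This is a rough sketch on a toy table, not output from the real package: every probe ID and symbol below is invented, and the column names (probe_id, alias_symbol) are an assumption about what toTable() returns for the alias map.

```r
# Toy stand-in for toTable(hgu133plus2ALIAS2PROBE); all values are invented.
alias_tab <- data.frame(
  probe_id     = c("p1_at", "p1_at", "p2_at"),
  alias_symbol = c("GENEA", "GENEA_OLD", "GENEB"),
  stringsAsFactors = FALSE
)

input <- c("GENEA", "GENEA_OLD", "GENEB")

# Keep only the rows whose alias appears in the input list.
hits <- alias_tab[alias_tab$alias_symbol %in% input, ]

# Group the matched aliases by probe. A probe reached by more than one
# input symbol means two names in the list point at the same thing.
by_probe <- split(hits$alias_symbol, hits$probe_id)
shared   <- by_probe[lengths(by_probe) > 1]
```

Any probe listed in `shared` was hit by two or more of your input names, so those names are candidates for being aliases of one gene.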