package for collapsing probe id to entrezid or gene symbols
Wendy Qiao ▴ 360
@wendy-qiao-4501
Last seen 10.2 years ago
Hi all,

I am searching for a Bioconductor package that can collapse Affymetrix probe IDs to gene symbols or Entrez IDs, but I have not had any luck yet. Does anyone know of a package that can collapse probe IDs to gene symbols?

Thank you in advance,
Wendy
@james-w-macdonald-5106
Last seen 3 minutes ago
United States
Hi Wendy,

On 3/15/2011 11:30 AM, Wendy Qiao wrote:
> I am searching for a Bioconductor package that can collapse affymetrix probe
> id to gene symbols or entrez id, but I have not had any luck yet. Does
> anyone know any package that can collapse the probe id to gene symbols?

What exactly do you mean by 'collapse the probe id to gene symbol'? There are annotation packages for pretty much all of the Affy chips that will map probesets to either Entrez Gene ID or gene symbol (e.g., for the HG-U133Plus2 chip we have the hgu133plus2.db package). These provide the mappings in a simple way, using either get() or mget(). For example, with the hgu95av2.db package loaded:

    > library(hgu95av2.db)
    > get("1007_s_at", hgu95av2ENTREZID)
    [1] "780"
    > get("1007_s_at", hgu95av2GENENAME)
    [1] "discoidin domain receptor tyrosine kinase 1"
    > get("1007_s_at", hgu95av2SYMBOL)
    [1] "DDR1"

But maybe you are looking for something else?

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
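To look up several probe set IDs at once, mget() returns a list that can be flattened into a named vector. The following is only a minimal sketch: `toy_entrez` is a hypothetical stand-in environment filled with the IDs quoted in this thread, so that the snippet runs without any annotation package installed. With hgu95av2.db loaded, one would presumably pass hgu95av2ENTREZID in its place.

```r
# Stand-in for an annotation map such as hgu95av2ENTREZID (toy data only;
# the probe set / Entrez ID pairs are the ones quoted in this thread).
toy_entrez <- list2env(list(
  "1007_s_at" = "780",
  "1138_at"   = "6574",
  "1003_s_at" = "643",
  "1004_at"   = "643"
))

# "not_on_chip_at" is a made-up ID used to show what happens on a miss.
probesets <- c("1007_s_at", "1138_at", "1004_at", "not_on_chip_at")

# mget() returns a named list; ifnotfound = NA avoids an error for unknown IDs.
hits <- mget(probesets, envir = toy_entrez, ifnotfound = NA)

# Flatten to a named character vector (names are the probe set IDs).
entrez <- vapply(hits, function(x) as.character(x)[1], character(1))
print(entrez)
```

Note that two probe sets here ("1003_s_at" and "1004_at") share Entrez ID 643, so the lookup alone does not give one value per gene.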
Thank you, James.

Sorry, I did not make my question clear.

What I meant is that, on a chip, there are multiple probes for the same gene. I was wondering if there is a package or command that can combine the expression values from multiple probes for the same gene into one value by taking the median or average. It is not hard to write the code for this, but I was wondering whether there is an existing package for it.

Wendy

On 15 March 2011 12:03, James W. MacDonald wrote:
> What exactly do you mean by 'collapse the probe id to gene symbol'?
> But maybe you are looking for something else?
This is one option:

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/genomic_curated_CDF.asp

Paul.

On Tue, Mar 15, 2011 at 4:43 PM, Wendy Qiao wrote:
> What I meant is that, on a chip, there are multiple probes for the same
> gene. I was wondering if there is a package or command that can combine the
> expression values on multiple probes for the same gene to one value by
> taking the median or average.

--
Paul Geeleher (PhD Student)
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland
--
www.bioinformaticstutorials.com
Hi Wendy,

For something like this, the function aggregate() should do the job. Type ?aggregate in R for more details.

Yong Li

Wendy Qiao wrote:
> What I meant is that, on a chip, there are multiple probes for the same
> gene. I was wondering if there is a package or command that can combine the
> expression values on multiple probes for the same gene to one value by
> taking the median or average.
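As a rough illustration of the aggregate() suggestion, the sketch below collapses a toy expression matrix to one row per Entrez ID by taking the median over probe sets. The expression values and sample names are made up; the probe set to Entrez ID pairs are the ones quoted elsewhere in this thread. This is one possible way to do it, not necessarily what any package does internally.

```r
# Toy expression matrix: rows are probe sets, columns are samples
# (all expression values are made up for illustration).
expr <- matrix(
  c(7.1, 7.3,
    5.0, 5.4,
    6.0, 6.2,
    8.0, 8.4),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("1000_at", "1001_at", "1003_s_at", "1004_at"),
                  c("sample1", "sample2"))
)

# Probe set -> Entrez Gene ID, as quoted from hgu95av2ENTREZID in this thread.
entrez <- c("1000_at" = "5595", "1001_at" = "7075",
            "1003_s_at" = "643", "1004_at" = "643")

# Grouping vector aligned with the rows of the matrix.
gene_id <- entrez[rownames(expr)]

# One row per gene: median across all probe sets annotated to that gene.
collapsed <- aggregate(as.data.frame(expr),
                       by = list(entrez = gene_id),
                       FUN = median)
print(collapsed)
```

Swapping `FUN = median` for `FUN = mean` gives the average instead. Rows whose grouping value is NA (probe sets without an Entrez ID) are dropped by aggregate(), which may or may not be what you want.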
@herve-pages-1542
Last seen 2 days ago
Seattle, WA, United States
Hi Wendy,

FWIW, it seems to me that you can easily infer the mapping from probes to Entrez ids by combining the information stored in the <platform>probe and <platform>.db packages for a given platform.

For example, for the hgu95av2 platform, the hgu95av2probe package contains the mapping between probes and probe set ids:

    > library(hgu95av2probe)
    > head(as.data.frame(hgu95av2probe))
                       sequence   x   y Probe.Set.Name Probe.Interrogation.Position Target.Strandedness
    1 TGGCTCCTGCTGAGGTCCCCTTTCC 395 301        1138_at                         2631           Antisense
    2 GGCTGTGAATTCCTGTACATATTTC 322 441        1138_at                         2661           Antisense
    3 GCTTCAATTCCATTATGTTTTAATG 213 419        1138_at                         2703           Antisense
    4 GCCGTTTGACAGAGCATGCTCTGCG 279 435        1138_at                         2781           Antisense
    5 TGACAGAGCATGCTCTGCGTTGTTG 473 299        1138_at                         2787           Antisense
    6 CTCTGCGTTGTTGGTTTCACCAGCT 587 205        1138_at                         2799           Antisense

(Note that, unlike the probe sets, the probes don't have ids, but are uniquely identified by their x and y coordinates on the array.)

And the hgu95av2.db package contains the mapping between probe set ids and Entrez ids:

    > library(hgu95av2.db)
    > get("1138_at", hgu95av2ENTREZID)
    [1] "6574"
    > mget(keys(hgu95av2ENTREZID)[1:5], hgu95av2ENTREZID)
    $`1000_at`
    [1] "5595"

    $`1001_at`
    [1] "7075"

    $`1002_f_at`
    [1] "1557"

    $`1003_s_at`
    [1] "643"

    $`1004_at`
    [1] "643"

Cheers,
H.

On 03/15/2011 08:30 AM, Wendy Qiao wrote:
> I am searching for a Bioconductor package that can collapse affymetrix probe
> id to gene symbols or entrez id, but I have not had any luck yet.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
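The two mappings described above can be joined on the probe set id. The sketch below uses small hand-built data frames instead of the real packages so that it runs anywhere: the first three probe rows and the Entrez IDs are copied from the output above, while the fourth probe row (x = 100, y = 200) is made up. The names `probes`, `probeset2entrez`, and `probe2entrez` are illustrative only.

```r
# Toy slice of the probe-level table. Rows 1-3 are copied from the
# hgu95av2probe output above; row 4 is a made-up probe for a second probe set.
probes <- data.frame(
  x = c(395, 322, 213, 100),
  y = c(301, 441, 419, 200),
  Probe.Set.Name = c("1138_at", "1138_at", "1138_at", "1000_at"),
  stringsAsFactors = FALSE
)

# Probe set -> Entrez Gene ID, as returned by hgu95av2ENTREZID above.
probeset2entrez <- data.frame(
  Probe.Set.Name = c("1138_at", "1000_at"),
  entrez = c("6574", "5595"),
  stringsAsFactors = FALSE
)

# Join on the probe set id so every (x, y) probe carries an Entrez ID.
# all.x = TRUE keeps probes whose probe set has no Entrez ID (as NA).
probe2entrez <- merge(probes, probeset2entrez,
                      by = "Probe.Set.Name", all.x = TRUE)
print(probe2entrez)
```

With the real packages, the first data frame would presumably come from as.data.frame(hgu95av2probe) and the second from flattening the mget() result into two columns.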
Hi all,

Thanks a lot for your help.

Wendy

2011/3/21 Hervé Pagès <hpages@fhcrc.org>
> FWIW, it seems to me that you can easily infer the mapping from
> probes to Entrez ids by combining the information stored in the
> <platform>probe and <platform>.db packages for a given platform.