hgu133plus2 GO issues
@jacob-michaelson-1079
Last seen 10.2 years ago
Hi list,

Could someone please help me understand the differences between the (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES? I've found discrepancies that I can't quite explain:

> mget("GO:0042611", hgu133plus2GO2PROBE)
Error: value for 'GO:0042611' not found

> mget("GO:0042611", hgu133plus2GO2ALLPROBES)
$"GO:0042611"
          <NA>            IEA            IEA            IEA           <NA>
   "209309_at"  "217014_s_at"    "210325_at"  "218831_s_at" "1553402_a_at"
          <NA>           <NA>           <NA>           <NA>           <NA>
 "206086_x_at"  "206087_x_at"  "210864_x_at"  "211326_x_at"  "211327_x_at"
          <NA>           <NA>           <NA>           <NA>           <NA>
 "211328_x_at"  "211329_x_at"  "211330_s_at"  "211331_x_at"  "211332_x_at"
          <NA>           <NA>           <NA>            IEA           <NA>
 "211863_x_at"  "211866_x_at"  "214647_s_at"    "235754_at"  "213932_x_at"
           IEA           <NA>           <NA>            IEA           <NA>
 "215313_x_at"  "208729_x_at"  "209140_x_at"  "211911_x_at"  "208812_x_at"
          <NA>           <NA>            IEA           <NA>           <NA>
 "211799_x_at"  "214459_x_at"  "216526_x_at"    "200904_at"  "200905_x_at"
           IEA           <NA>           <NA>            IEA           <NA>
 "217456_x_at"  "204806_x_at"  "221875_x_at"    "221978_at"  "210514_x_at"
          <NA>           <NA>           <NA>            IEA            IEA
 "211528_x_at"  "211529_x_at"  "211530_x_at"  "217436_x_at"    "231748_at"
          <NA>            IEA            IEA            IEA
   "221291_at"    "238542_at"    "221323_at" "1552777_a_at"

and finally...

### "208729_x_at" is one of the probes returned with the above command
> grep("GO:0042611",unlist(mget("208729_x_at", hgu133plus2GO)))
numeric(0)

"208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't map it to the same GO ID.

Is there something wrong here or am I just missing something? If different, which is the most "reliable" mapping? I'm concerned because I went through to validate GO IDs I had gotten from the GOHyperG function (a total of 314), and 117 of those I could not map back to my significant probe list using the hgu133plus2GO annotation. I noticed by looking at the GOHyperG function that it uses information from GO2ALLPROBES.

Any help/enlightenment is much appreciated.

PS - using R 2.2.1 with hgu133plus2 1.10.0

--Jake
Annotation GO hgu133plus2 probe • 1.4k views
Seth Falcon ★ 7.4k
@seth-falcon-992
Last seen 10.2 years ago
Hi Jake,

Jake <jjmichael at comcast.net> writes:
> Could someone please help me understand the differences between the
> (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES? I've found discrepancies that I
> can't quite explain:
>
> > mget("GO:0042611", hgu133plus2GO2PROBE)
> Error: value for 'GO:0042611' not found

GO annotates probe ids (really Entrez Gene ids) at the most specific term in the GO ontology. In the above search of hgu133plus2GO2PROBE, you are seeing that GO:0042611 does not have any annotations made directly to it.

> > mget("GO:0042611", hgu133plus2GO2ALLPROBES)
> $"GO:0042611"
>           <NA>            IEA            IEA            IEA           <NA>
>    "209309_at"  "217014_s_at"    "210325_at"  "218831_s_at"
[snip]

For a given GO term, the hgu133plus2GO2ALLPROBES environment is giving you all Affy ids that map to this GO term _or_ a more specific term that is related to this term (by related, I mean child-like relation, where there is a path in the DAG connecting the terms).

The names on the vector are evidence codes. See the man pages for details.

So for the above two cases, this is as expected and I don't think there is any inconsistency.

> and finally...
>
> ### "208729_x_at" is one of the probes returned with the above command
> > grep("GO:0042611",unlist(mget("208729_x_at", hgu133plus2GO)))
> numeric(0)

When you say "above command", which one are you referring to? hgu133plus2GO should be the inverse map for hgu133plus2GO2PROBE.

> "208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't
> map it to the same GO ID.

Can you be more specific? Which env in the GO package are you talking about? Note that GO2ALLPROBES does not map to GO ids, it maps _from_ GO ids.

You can ask which GO ids have the 208729_x_at annotation using hgu133plus2GO.

If you then grep through hgu133plus2GO2ALLPROBES for GO ids that have 208729_x_at in their probe vector, then you should find more GO ids because you are picking up parent terms that don't have the specific annotation. However, all the ids you found in hgu133plus2GO should appear.

Clear as mud? :-)

> Is there something wrong here or am I just missing something? If
> different, which is the most "reliable" mapping? I'm concerned because
> I went through to validate GO IDs I had gotten from the GOHyperG
> function (a total of 314), and 117 of those I could not map back to my
> significant probe list using the hgu133plus2GO annotation. I noticed by
> looking at the GOHyperG function that it uses information from
> GO2ALLPROBES.
>
> Any help/enlightenment is much appreciated.
>
> PS - using R 2.2.1 with hgu133plus2 1.10.0

PS: sessionInfo() would be a better way to report that. Then we would also know your version of the GO package, for example.

+ seth
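The inverse relationship between hgu133plus2GO and hgu133plus2GO2PROBE can be sketched in base R with plain lists standing in for the annotation environments. This is only an illustration: the names `probe2go` and `go2probe`, the probe `probe_B`, and the trimmed annotations are toy assumptions, not the contents of the hgu133plus2 package.

```r
# Toy stand-in for hgu133plus2GO: probe id -> directly annotated GO ids.
# (Hypothetical, trimmed data; the real environment also stores evidence codes.)
probe2go <- list(
  "208729_x_at" = c("GO:0042612", "GO:0016020"),
  "probe_B"     = c("GO:0016020")
)

# Invert it to get the analogue of hgu133plus2GO2PROBE: GO id -> probe ids.
go2probe <- split(
  rep(names(probe2go), lengths(probe2go)),
  unlist(probe2go, use.names = FALSE)
)

# A term with no direct annotation is simply absent from the inverse map,
# which is what the "not found" error on GO:0042611 reflects.
"GO:0042611" %in% names(go2probe)

# Terms that do carry direct annotations list every probe annotated to them.
go2probe[["GO:0016020"]]
```

In this toy setup, every GO id that appears under a probe in `probe2go` appears as a key in `go2probe`, and nothing else does.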
On Tue, 2006-04-18 at 09:37 -0700, Seth Falcon wrote:
> [snip]

Thanks for all the help, guys. It really helped my understanding of how the GO mappings work in the context of BioC. I had previously assumed that mappings in all the GO environments were multi-level, and now I know that really only the GO2ALLPROBES environment is.

Jim, sorry for personally replying to you. I meant to send to the list, but I frequently hit "reply" instead of "reply to all" by accident.

--Jake
@james-w-macdonald-5106
Last seen 7 hours ago
United States
Hi Jake,

Jake wrote:
> Could someone please help me understand the differences between the
> (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES?
[snip]

Here is the difference:

hgu133plus2GO maps Probe IDs to GO terms
hgu133plus2GO2PROBE maps GO terms to Probe IDs
hgu133plus2GO2ALLPROBES maps each GO term to the Probe IDs annotated to that term or to any of its child terms

So there isn't really an issue of reliability here, just an issue of what you want. In your case, 208729_x_at doesn't map to GO:0042611, but it does map to children of that GO term (for instance GO:0042612).

> sapply(get("208729_x_at", hgu133plus2GO), function(x) x[[1]])
  GO:0005624   GO:0005887   GO:0016020   GO:0016021   GO:0019882   GO:0019883
"GO:0005624" "GO:0005887" "GO:0016020" "GO:0016021" "GO:0019882" "GO:0019883"
  GO:0019885   GO:0030106   GO:0030106   GO:0042612
"GO:0019885" "GO:0030106" "GO:0030106" "GO:0042612"

> grep("208729_x_at",get("GO:0042612", hgu133plus2GO2PROBE))
[1] 20
> grep("208729_x_at",get("GO:0042611", hgu133plus2GO2PROBE))
Error in get(x, envir, mode, inherits) : variable "GO:0042611" was not found
> grep("208729_x_at",get("GO:0042611", hgu133plus2GO2ALLPROBES))
[1] 20

HTH,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
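How a GO2ALLPROBES-style map can be built from direct annotations plus the parent/child structure can be sketched as follows. This is a toy reconstruction in base R, not the code used to build the annotation package: `children`, `go2probe`, `descendants`, `all_probes`, and `probe_B` are hypothetical names, and the DAG slice contains only the single parent/child pair mentioned in this thread.

```r
# Toy slice of the GO DAG: parent term -> child terms.
# (Only the GO:0042611 -> GO:0042612 relation from this thread is included.)
children <- list("GO:0042611" = c("GO:0042612"))

# Toy direct annotations, the analogue of GO2PROBE (probe_B is made up).
go2probe <- list("GO:0042612" = c("208729_x_at", "probe_B"))

# Collect every term reachable below `term` in the toy DAG.
descendants <- function(term) {
  kids <- children[[term]]
  if (is.null(kids)) return(character(0))
  unique(c(kids, unlist(lapply(kids, descendants))))
}

# Analogue of GO2ALLPROBES: probes on the term itself or on any descendant.
all_probes <- function(term) {
  terms <- c(term, descendants(term))
  unique(unlist(go2probe[terms], use.names = FALSE))
}

# The parent has no direct annotation in the toy data...
is.null(go2probe[["GO:0042611"]])
# ...but the roll-up still reaches the probe through the child term.
"208729_x_at" %in% all_probes("GO:0042611")
```

The roll-up only ever adds probes to parent terms; it never adds GO ids to a probe's own direct annotation list.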
@james-w-macdonald-5106
Last seen 7 hours ago
United States
Jake wrote:
>> Here is the difference:
>>
>> hgu133plus2GO maps Probe IDs to GO terms
>> hgu133plus2GO2PROBE maps GO terms to Probe IDs
>> hgu133plus2GO2ALLPROBES maps GO terms and all children of the terms to
>> Probe IDs
>
> Thanks for the quick response, Jim. I just want to make sure that I'm
> understanding this correctly:
>
> The "children" are more specific descriptions/functions of the "parent"
> node (right?). So are you saying that even if an Affy Probe ID only has
> evidence for a given parent node, GO2ALLPROBES will also include
> connected children nodes for which there is no evidence for that Affy
> ID?

Nope, you have that backwards. GO2ALLPROBES maps GO terms to AffyIDs. So if you have a GO term, say phosphorylation, and there aren't any AffyIDs that map to that particular GOID, GO2PROBE won't list anything. However, if there is an AffyID that maps to protein phosphorylation, which is a child term of phosphorylation, then GO2ALLPROBES will list that AffyID when you do a get() on the phosphorylation GOID.

What you are talking about is the mapping in the hgu133plus2GO environment. In that case, if a given AffyID maps to e.g., phosphorylation, that is the only GOID that will be returned, not that term and all its children.

As an aside, please don't respond just to me. Keep things on the list so the questions/answers can be found by others.

HTH,

Jim

> Thanks,
>
> Jake
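The phosphorylation example also suggests why terms reported by a test built on GO2ALLPROBES might fail to map back to a probe list through the direct annotations alone. Below is a minimal sketch with toy lists in base R. The term labels are used in place of GO ids, and `probe_A`, `sig_probes`, `terms`, and `has_hit` are hypothetical names introduced only for illustration.

```r
# Toy direct map (GO2PROBE analogue): only the specific term is annotated.
go2probe <- list("protein phosphorylation" = c("probe_A"))

# Toy roll-up map (GO2ALLPROBES analogue): the parent inherits the child's probe.
go2allprobes <- list(
  "phosphorylation"         = c("probe_A"),
  "protein phosphorylation" = c("probe_A")
)

sig_probes <- c("probe_A")                                # hypothetical significant probes
terms <- c("phosphorylation", "protein phosphorylation")  # hypothetical reported terms

# TRUE when at least one significant probe is listed under the term in `map`.
has_hit <- function(map, term) any(sig_probes %in% map[[term]])

# Validate each reported term against the direct map and the roll-up map.
direct <- vapply(terms, function(t) has_hit(go2probe, t), logical(1))
rolled <- vapply(terms, function(t) has_hit(go2allprobes, t), logical(1))

direct  # the parent term fails when checked against direct annotations
rolled  # both terms are recovered when checked against the roll-up map
```

Under these assumptions, checking reported terms against the same kind of map the test used (the roll-up) is the consistent choice; checking against direct annotations will miss parent terms.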