Hi
The golub dataset has identifiers which I can only assume are based on
GenBank/EMBL accession numbers (e.g. M71243_f_at, U29175_at).
I want to enrich the annotation for this data set by mapping these
identifiers to GO terms, KEGG pathways etc., but I can't figure out how
to do it using Bioconductor.
Can anyone give me a few tips?
Mick
This dataset was generated on Affymetrix hu6800 chips, so the
identifiers are Affymetrix probe set IDs. You can therefore use the
hu6800 metadata package to find the information you need.
Jan
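
A minimal sketch of such a lookup, assuming a current Bioconductor installation in which the hu6800 annotation is shipped as the hu6800.db package and queried through AnnotationDbi's select() interface (the metadata package discussed in this thread predates that interface). The helper name lookup_annotation is hypothetical, and it returns NULL when the package is not installed:

```r
# Hypothetical helper: map hu6800 probe set IDs to gene symbols and GO terms.
# Assumes the Bioconductor packages hu6800.db and AnnotationDbi are installed.
lookup_annotation <- function(probe_ids) {
  if (!requireNamespace("hu6800.db", quietly = TRUE)) {
    message("hu6800.db is not installed; returning NULL")
    return(NULL)
  }
  AnnotationDbi::select(
    hu6800.db::hu6800.db,
    keys    = probe_ids,
    columns = c("SYMBOL", "GO"),
    keytype = "PROBEID"
  )
}

# The two example identifiers from the original question
res <- lookup_annotation(c("M71243_f_at", "U29175_at"))
if (!is.null(res)) print(head(res))
```

The result, when available, is a data frame with one row per probe/term combination, so a probe with several GO terms appears on several rows.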
> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch]On Behalf Of michael
> watson (IAH-C)
> Sent: Friday 4 February 2005 15:05
> To: bioconductor@stat.math.ethz.ch
> Subject: [BioC] Getting GO terms and other annotation for
> Golub data set
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
On Feb 4, 2005, at 9:05 AM, michael watson ((IAH-C)) wrote:
> Hi
>
> The golub dataset has identifiers which I can only assume are based on
> GenBank/EMBL accession numbers (Eg M71243_f_at, U29175_at etc).
>
These are Affy IDs, I think from the Hgu95a array? In any case, you
can use the annotate, GOstats, etc. packages with the appropriate
annotation package (like I said, I think the 95a array). Check out the
vignette on using the annotate package as a start.
> I want to enrich the annotation for this data set by mapping these
> identifiers to GO terms, KEGG pathways etc but I can't figure out how
> to do it using bioconductor.
>
> Can anyone give me a few tips?
>
> Mick
>
Hi
Thanks for all of the replies!
I am getting there slowly - please someone point me to an obvious
tutorial if I have missed it! I've read the annotate vignette...
I have the metaData package for hu6800, GO, and KEGG. And I can do
stuff like this:
l <- get("X65663_at",env=hu6800GO)
names(l)
get("GO:0006325", env=GOTERM)
What I want to do is look at groups of genes that I have found and see
if they make sense. By make sense, I mean "do they have similar or
related functions", "do they appear in the same pathway" etc etc.
So now I have the GO and KEGG metaData packages, and I know how to query
them at a very low level. My next step will be to write some code to
take a group of affy identifiers, query these packages and see if they
all seem to hit the same KEGG pathway, or have GO terms in common. Has
anyone done this before and put it in a nice package, or do I write it
from scratch?
Cheers
Mick
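
The "terms in common" step described above can be sketched in base R. This is only an illustration under stated assumptions: probe_terms is a hypothetical toy mapping (not real annotation), standing in for whatever probe-to-term list is pulled from the metadata packages, and shared_terms is a made-up helper name.

```r
# Hypothetical toy mapping from probe id to annotation terms (not real data).
probe_terms <- list(
  probe_a = c("term1", "term2"),
  probe_b = c("term2", "term3"),
  probe_c = c("term2")
)

# Count how many probes in the group carry each term.
shared_terms <- function(probe_ids, mapping) {
  # Drop duplicate terms within a probe so each probe counts once per term
  terms  <- lapply(mapping[probe_ids], unique)
  counts <- table(unlist(terms, use.names = FALSE))
  sort(counts, decreasing = TRUE)
}

group  <- c("probe_a", "probe_b", "probe_c")
counts <- shared_terms(group, probe_terms)
print(counts)

# Terms carried by every probe in the group
common <- names(counts)[counts == length(group)]
print(common)
```

A raw count like this does not say whether the overlap is more than expected by chance; that needs a test against the whole chip.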
On Feb 7, 2005, at 5:21 AM, michael watson ((IAH-C)) wrote:
>
> So now I have the GO and KEGG metaData packages, and I know how to query
> them at a very low level. My next step will be to write some code to
> take a group of affy identifiers, query these packages and see if they
> all seem to hit the same KEGG pathway, or have GO terms in common. Has
> anyone done this before and put it in a nice package, or do I write it
> from scratch?
>
The GOstats package does this for GO. You could use that directly for
the GO stuff. For the KEGG stuff, I think you would have to write
something, but it shouldn't be too hard. Also, there are now several
websites that allow you to do this (not an R solution, but perhaps a
good first-pass).
Sean
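
For the "write something" KEGG case, one common approach (an assumption here, not something prescribed in this thread) is a hypergeometric over-representation test per pathway. A minimal base-R sketch, with a hypothetical helper name pathway_pvalue and made-up counts:

```r
# Over-representation p-value for one pathway via the hypergeometric distribution.
#   overlap:       selected probes that are annotated to the pathway
#   pathway_size:  probes on the chip annotated to the pathway
#   universe_size: probes on the chip with any pathway annotation
#   selected_size: probes in the gene list of interest
pathway_pvalue <- function(overlap, pathway_size, universe_size, selected_size) {
  # P(X >= overlap); subtract 1 because phyper's upper tail is P(X > q)
  phyper(overlap - 1,
         pathway_size,
         universe_size - pathway_size,
         selected_size,
         lower.tail = FALSE)
}

# Made-up numbers for illustration only
p <- pathway_pvalue(overlap = 8, pathway_size = 40,
                    universe_size = 5000, selected_size = 50)
print(p)
```

Running this once per pathway gives a list of p-values, which would normally be adjusted for multiple testing (for example with p.adjust) before interpretation.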
Thanks Sean.
I don't know whether my installation of GOstats has gone awry at some
point, but there doesn't seem to be a manual or vignette for the package
in C:\Program Files\R\rw2001\library\GOstats\doc. There is an
index.html but it is next to empty.
So if I have two affy ids, I can get the GO terms easily, but it's not
immediately obvious which functions in GOstats I could use to see if
those two affy ids are related in a functional sense.
Mick,
GOHyperG is a useful function for that. Also, the vignettes are
available at http://www.bioconductor.org. Finally, to get a list of
all the functions in a package, do:
help(package=GOstats)
for example.
Sean