GO over representation analysis

Heike Pospisil ▴ 310

@heike-pospisil-1097

Last seen 9.6 years ago

Hello Bioconductors,

I am looking for a method to perform over-representation analysis (Gene Ontology) within R. I have data from the Maize Oligonucleotide Array (two channel) with the GO categories for all probes on this array. I have clustered the genes using Maanova and I am interested in GO over-representation of the gene lists from these clusters.

I know the GO tools from Bioconductor (e.g. GOstats), but I do not know how to adapt the analysis to an 'unusual' array with no annotation data package and no Entrez IDs. Any hints?

Thanks in advance,
Heike

--
Dr. Heike Pospisil | pospisil at zbh.uni-hamburg.de
University of Hamburg | Center for Bioinformatics
Bundesstrasse 43 | 20146 Hamburg, Germany
phone: +49-40-42838-7303 | fax: +49-40-42838-7312

Annotation GO maanova • 1.7k views

ADD COMMENT • link updated 15.7 years ago by Marc Carlson ★ 7.2k • written 15.7 years ago by Heike Pospisil ▴ 310

Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 7.7 years ago

United States

You could just make an annotation package for the array in question by using the SQLForge code in the AnnotationDbi package. You can find instructions on how to do this here:

http://www.bioconductor.org/packages/2.3/bioc/html/AnnotationDbi.html

Let me know if you have any questions about SQLForge.

Marc

ADD COMMENT • link 15.7 years ago Marc Carlson ★ 7.2k

Dear All,

I think one thing that is frustrating here is that there is no simple guide for people who want to create an annotation package for an array that does not yet have one.

Do we use AnnotationDbi? Or AnnBuilder? Or is there another way?

What is the "best practice" for building an annotation package?

Thanks,
Mick

-----Original Message-----
From: Marc Carlson
Sent: 25 August 2008 17:03
To: Heike Pospisil
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GO over representation analysis

> You could just make an annotation package for the array in question by
> using the SQLForge code in the AnnotationDbi package.

ADD REPLY • link 15.7 years ago michael watson IAH-C ★ 3.4k

On Fri, Aug 29, 2008 at 8:33 AM, michael watson (IAH-C) <michael.watson at bbsrc.ac.uk> wrote:

> Do we use AnnotationDbi? Or AnnBuilder? Or is there another way?
>
> What is the "best practice" for building an annotation package?

Hi, Mick. The confusion arises because annotation packages have been migrated from the environment-based packages built by AnnBuilder to the newer SQLite-based packages of AnnotationDbi. The answer depends on which version of R and, therefore, which version of Bioconductor you are using. That said, the standard for the current and future releases (for the near future, anyway) is to use SQLForge from AnnotationDbi.

Sean

ADD REPLY • link 15.7 years ago Sean Davis 21k

There is one slight additional wrinkle. AnnotationDbi currently supports fewer species than AnnBuilder. Building a package using AnnotationDbi is (at least) a two-step process, of which only one step is required of the end user. The first step(s) are to build a database containing all relevant data for a given species; that database is then used to populate the chip-specific package databases.

If you are interested in an annotation package for a species that is not yet supported by AnnotationDbi, then you will need to consult with Marc Carlson to get the primary database built.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662

ADD REPLY • link 15.7 years ago James W. MacDonald 65k

OK, so there is no chicken.db0 or cow.db0 package, so that rules that one out.

My experience is that most people have a "star" schema of information linking probes to genes to GO IDs, pathways, Entrez IDs, etc.

It *should* be simple to make an annotation package out of this, but it isn't.

And I hate to be hyper-critical, but the vignette for AnnBuilder is virtually inaccessible...

-----Original Message-----
From: James W. MacDonald
Sent: 29 August 2008 14:07
To: Sean Davis
Cc: michael watson (IAH-C); bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GO over representation analysis

> If you are interested in an annotation package that is not yet supported
> by AnnotationDbi, then you will need to consult with Marc Carlson to get
> the primary database built.

ADD REPLY • link 15.7 years ago michael watson IAH-C ★ 3.4k

michael watson (IAH-C) wrote:

> OK, so there is not chicken.db0 or cow.db0 package so that rules that
> one out.

Really? Does this mean you contacted Marc Carlson about creating packages for these species and he told you it was out of the question? I would find that rather surprising. In the past (and a quick search of the BioC listserv bears this out), he has been rather accommodating.

> My experience is that most people have a "star" schema of information
> linking probes to genes to GO IDs, Pathways, Entrez etc etc.
>
> It *should* be simple to make an annotation package out of this, but it
> isn't.
>
> And I hate to be hyper-critical but the Vignette for AnnBuilder is
> virtually inaccessible...

Yes, well it is a bit much to ask that the vignette for a soon-to-be-deprecated package be updated, no?

ADD REPLY • link 15.7 years ago James W. MacDonald 65k

I am sure Marc would be very accommodating :)

What would be excellent, however, is if I could create a db0 package myself. I have all the data: all the links from Ensembl, Entrez, GO, KEGG, UniGene, etc.

Is it planned to provide tools to create db0 packages?

-----Original Message-----
From: James W. MacDonald
Sent: 29 August 2008 15:50
To: michael watson (IAH-C)
Cc: Sean Davis; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GO over representation analysis

> Really? Does this mean you contacted Marc Carlson about creating
> packages for these species and he told you it was out of the question?

ADD REPLY • link 15.7 years ago michael watson IAH-C ★ 3.4k

michael watson (IAH-C) wrote:

> I am sure Marc would be very accommodating :)
>
> What would be excellent, however, is if I could create a db0 package
> myself.
> I have all the data - all the links from ensembl, entrez, GO, KEGG,
> Unigene etc.
>
> Is it planned to provide tools to create db0 packages?

My understanding is that this is part of the plan. Marc would know better, and will also have a better idea of the timeline for such tools to appear.

ADD REPLY • link 15.7 years ago James W. MacDonald 65k

Joern Toedling ▴ 730

@joern-toedling-1244

Last seen 9.6 years ago

Hello,

I was quite happy with the package topGO, which allows you to supply the mapping between gene identifiers and GO term IDs as a list. Such a list can be easily created using the biomaRt package to query the Ensembl database, or it can be created from other annotation sources. The topGO vignette and some previous posts in the archive of the Bioconductor list, such as the one below, may provide additional clues about how to perform this analysis with topGO.

http://thread.gmane.org/gmane.science.biology.informatics.conductor/18985/focus=19016

Regards,
Joern
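To make the list-based approach concrete, here is a minimal sketch in base R. It assumes the array annotation is available as a two-column table of probe IDs and GO IDs. The probe names (`p1` to `p8`), the cluster membership, and the helper `ora_pvalue` are all hypothetical, and the GO IDs serve only as labels. The sketch builds a named list mapping each probe to its GO IDs and runs a plain one-sided hypergeometric test per term. It is not the topGO or GOstats implementation, and it ignores the GO graph (annotations are not propagated to parent terms).

```r
# Hypothetical toy annotation: one row per probe/GO-term pair.
probe2go <- data.frame(
  probe = c("p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6", "p7", "p8"),
  go    = c("GO:0006412", "GO:0008152", "GO:0006412", "GO:0006412",
            "GO:0008152", "GO:0008152", "GO:0008152", "GO:0006810",
            "GO:0006810", "GO:0006810"),
  stringsAsFactors = FALSE
)

# Named list: probe ID -> character vector of GO IDs.
gene2GO <- split(probe2go$go, probe2go$probe)

# Probes in one cluster of interest (invented for illustration).
cluster <- c("p1", "p2", "p3")

# One-sided hypergeometric test for a single GO term:
# is the term seen more often in the cluster than expected by chance,
# given all annotated probes on the array as the universe?
ora_pvalue <- function(term, cluster, gene2GO) {
  universe <- names(gene2GO)
  cluster  <- intersect(cluster, universe)
  has_term <- vapply(gene2GO, function(ids) term %in% ids, logical(1))
  m <- sum(has_term)             # probes in the universe with the term
  n <- length(universe) - m      # probes in the universe without it
  k <- length(cluster)           # cluster size
  x <- sum(has_term[cluster])    # cluster probes with the term
  phyper(x - 1, m, n, k, lower.tail = FALSE)  # P(X >= x)
}

terms <- sort(unique(probe2go$go))
pvals <- vapply(terms, ora_pvalue, numeric(1),
                cluster = cluster, gene2GO = gene2GO)

# Adjust for testing several terms (Benjamini-Hochberg).
padj <- p.adjust(pvals, method = "BH")
print(data.frame(term = terms, p = pvals, p_adj = padj, row.names = NULL))
```

In this toy data the first term is annotated to all three cluster probes and to no others, so it gets the smallest p-value. For a real analysis, a list shaped like `gene2GO` is the kind of gene-to-GO mapping the topGO vignette describes; check the vignette for the exact arguments.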

ADD COMMENT • link 15.7 years ago Joern Toedling ▴ 730
