GSEA and Broad gene sets

Brian Hare ▴ 30

@brian-hare-2013

Last seen 11.4 years ago

Are there any detailed instructions available (e.g. vignettes) for how to do GSEA on the Broad collection of pathways in Bioconductor? I see bits and pieces, e.g. GSEABase and geneSetTest (limma), but haven't seen it all put together. Thanks.

Pathways GSEABase • 3.0k views

ADD COMMENT • link updated 18.1 years ago by Weiwei Shi ★ 1.2k • written 18.1 years ago by Brian Hare ▴ 30

Martin Morgan 25k

@martin-morgan-1513

Last seen 11 months ago

United States

Hi Brian,

It depends a bit on what you mean by GSEA. As Duane mentions, there's an R script for one version at the Broad. But there are also at least the PGSEA package, geneSetTest in limma, and the GlobalAncova package for doing conceptually similar things. And of course in R you can write your own variant.

In terms of Broad sets, PGSEA will read .gmt files (see the PGSEA vignette). There's also a function in GSEABase called getBroadSets that, when working (see below), retrieves pre-defined gene sets from the Broad. These sets can then be used in PGSEA, or directly in R.

A recent resource for performing GSEA-style analyses in R is from the just-concluded Bioconductor course (http://bioconductor.org -> Workshops -> 2007 -> Introduction to Bioconductor -> GSEA -> GSEA_Lecture.pdf). The main variant is that at step 3, non-specific filtering, you'll use your Broad collections:

    > library(GSEABase)
    > fl <- system.file("extdata", "Broad.xml", package="GSEABase")
    > gsc <- getBroadSets(fl)

(not a very big collection, just two sets!). If you have identified the particular collections you'd like to work with, then in an ideal world you'd be able to do something like

    > gsc <- getBroadSets(asBroadUri(c('chr16q', 'GNF2_ZAP70')))

to retrieve these sets from the Broad website. Unfortunately, the Broad changed their DB access so that it no longer exports the XML required by getBroadSets; they are in the process of re-enabling that export service, and getBroadSets will work when that ability is restored.

In the meantime, you can visit the Broad site, register, and then download and extract

    http://www.broad.mit.edu/gsea/resources/files_to_download_locally_on_firewall_issues.zip

You'll then be able to

    > gsc <- getBroadSets("path/to/msigdb_v2.xml")

This will get you all the gene sets defined at the Broad. You'll be able to (actually, want to) subset gsc as desired; this might be useful anyway, as you can for instance use grep and lapply to select gene sets based on regular expressions or other criteria (e.g., Broad collection category). The vignette in GSEABase gives some additional information.

Martin

--
Dr. Martin Morgan, PhD
Computational Biology Shared Resource Director
Fred Hutchinson Cancer Research Center
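
The subsetting step Martin describes, followed by a gene set test, might look roughly like the sketch below. It is not from the original post. It assumes the two example sets shipped in GSEABase's Broad.xml, uses the bcCategory accessor on each set's collection type to read the Broad category, and feeds simulated per-gene statistics (purely hypothetical, standing in for your own moderated t-statistics or similar) to limma's geneSetTest.

```r
library(GSEABase)
library(limma)

# Load the small example collection shipped with GSEABase
fl  <- system.file("extdata", "Broad.xml", package = "GSEABase")
gsc <- getBroadSets(fl)

# Select gene sets whose names match a regular expression
gsc_chr <- gsc[grep("^chr", names(gsc))]

# Select gene sets by Broad collection category (e.g. "c1")
cats   <- unlist(lapply(gsc, function(gs) bcCategory(collectionType(gs))))
gsc_c1 <- gsc[which(cats == "c1")]

# ASSUMPTION: simulated per-gene statistics named by gene symbol.
# In a real analysis these would come from your own data.
set.seed(1)
universe <- unique(unlist(geneIds(gsc)))
stats <- rnorm(length(universe))
names(stats) <- universe

# Test each selected set against the rest of the gene universe
pvals <- sapply(names(gsc_chr), function(nm) {
  in_set <- names(stats) %in% geneIds(gsc_chr[[nm]])
  geneSetTest(in_set, stats)
})
```

With a full msigdb XML file the same grep and category filters apply unchanged; only the path passed to getBroadSets differs.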

ADD COMMENT • link 18.1 years ago Martin Morgan 25k

Hassane, Duane ▴ 110

@hassane-duane-1639

Last seen 11.4 years ago

The Broad Institute put out an R script, GSEA.R, which works with their .gmt files and uses their KS-based enrichment score metric:

http://www.broad.mit.edu/gsea/software/software_index.html

I have not yet tried using specific BioC packages with Broad .gmt files, though. Not sure if that's what you're looking for.

Duane Hassane
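
To make the two ideas in this post concrete, here is a minimal base-R sketch, not taken from GSEA.R. It assumes the usual .gmt layout (set name, description, then gene identifiers, all tab-separated) and computes the classic unweighted Kolmogorov-Smirnov running-sum score; the Broad script may weight hits differently. The function names read_gmt and enrichment_score and the toy data are hypothetical. GSEABase also provides getGmt() as a packaged way to read these files.

```r
# Parse a .gmt file into a named list of gene identifier vectors.
# ASSUMPTION: each line is name <TAB> description <TAB> gene1 <TAB> gene2 ...
read_gmt <- function(path) {
  lines  <- readLines(path)
  lines  <- lines[nzchar(lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  sets   <- lapply(fields, function(f) f[-(1:2)])
  names(sets) <- vapply(fields, function(f) f[1], character(1))
  sets
}

# Unweighted KS-style enrichment score: walk down the ranked gene list,
# stepping up on a hit and down on a miss, and return the running-sum
# value with the largest absolute deviation from zero.
enrichment_score <- function(ranked_genes, gene_set) {
  hit    <- ranked_genes %in% gene_set
  n_hit  <- sum(hit)
  n_miss <- length(ranked_genes) - n_hit
  if (n_hit == 0 || n_miss == 0) return(NA_real_)
  running <- cumsum(ifelse(hit, 1 / n_hit, -1 / n_miss))
  running[which.max(abs(running))]
}

# Toy example with made-up gene names
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c("SET_A\tna\tg1\tg2\tg3",
             "SET_B\tna\tg8\tg9\tg10"), gmt_file)
sets   <- read_gmt(gmt_file)
ranked <- paste0("g", 1:10)   # g1 is ranked highest, g10 lowest
scores <- vapply(sets, function(s) enrichment_score(ranked, s), numeric(1))
```

In this toy ranking SET_A sits entirely at the top and SET_B entirely at the bottom, so their scores have opposite signs. Significance would still need a permutation step, which is omitted here.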

ADD COMMENT • link 18.1 years ago Hassane, Duane ▴ 110

Weiwei Shi ★ 1.2k

@weiwei-shi-1407

Last seen 11.4 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071205/e428a68d/attachment.pl

ADD COMMENT • link 18.1 years ago Weiwei Shi ★ 1.2k

Hi,

Weiwei Shi wrote:
> Hi, Brian:
> I personally recommend you to use its desktop version instead of R version
> of GSEA from the Broad. I happened to have some bugs when I used its R
> version before. The only thing you need to do is do some input/output
> conversion.

It would probably be of more use to everyone to report bugs in an accurate and helpful way, rather than making this sort of vague negative comment. If there really are bugs, I am pretty sure the developer would like to know about them.

best wishes
Robert

> Again, just2cent,
>
> Weiwei

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center

ADD REPLY • link 18.1 years ago rgentleman ★ 5.5k

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071206/b5b2d5e2/attachment.pl

ADD REPLY • link 18.1 years ago Weiwei Shi ★ 1.2k
