GSEA to discover co-regulated genes
Paul Geeleher ★ 1.3k
@paul-geeleher-2679
Hi All,

I've been following the instructions here:

http://www.bioconductor.org/workshops/2007/seattle_bioc_intro_nov_07/folder.2007-11-30.5595085375/

to find dysregulated KEGG pathways in a dataset. What I'm now wondering is whether I can use the same methodology to find co-regulated genes / genes with common transcription factors.

I'd assume it's simply a matter of redefining the gene set

    gsc <- GeneSetCollection(eset, setType = KEGGCollection())

to

    gsc <- GeneSetCollection(eset, setType = CoRegulatedGenesOrSomeFunctionLikeThat())

I suppose what I'm asking is whether such a gene set exists in Bioconductor. And if not, can this be done somewhere else?

Thanks.

--
Paul Geeleher
Department of Mathematics
National University of Ireland
Galway
Ireland
Transcription Pathways
@vincent-j-carey-jr-4
On Wed, Jan 21, 2009 at 6:56 AM, Paul Geeleher <paulgeeleher@gmail.com> wrote:

> Hi All,
>
> I've been following the instructions here:
>
> http://www.bioconductor.org/workshops/2007/seattle_bioc_intro_nov_07/folder.2007-11-30.5595085375/
>
> to find dysregulated kegg pathways in a dataset. What I'm now
> wondering is if I can use the same methodology to find co-regulated
> genes / genes with common transcription factors?
>
> I'd assume its simply of redefining the gene set
>
> gsc <- GeneSetCollection(eset, setType = KEGGCollection())
> to
> gsc <- GeneSetCollection(eset, setType =
> CoRegulatedGenesOrSomeFunctionLikeThat())
>
> I suppose what I'm asking is if such a gene set exists in
> Bioconductor? And if not can this be done somewhere else?

GSEABase has infrastructure to import the Broad MSIGDB from its XML serialization; see http://www.broad.mit.edu/gsea/downloads.jsp, where you will need to register. If you use getBroadSets() in GSEABase to import the entire MSIGDB, you will have access to 5452 gene sets. Broad categorizes these in five groups; group c3 contains the motif gene sets, which include a subclass called transcription factor targets.

Digging through a GSEABase GeneSetCollection can proceed in various ways.
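As a rough sketch of the import step, the following reads a Broad XML file with getBroadSets(). To stay runnable without registering at the Broad site, it assumes the small example file "Broad.xml" that ships in the GSEABase package's extdata directory; the path in the commented-out line is a placeholder, not a real location, and the variable names fl and gss are just illustrative.

```r
library(GSEABase)

# Small example file bundled with GSEABase (an assumption made so the
# sketch runs without downloading the full MSIGDB).
fl <- system.file("extdata", "Broad.xml", package = "GSEABase")

# getBroadSets() parses the Broad XML serialization into a GeneSetCollection.
gss <- getBroadSets(fl)
gss

# For the full database, point getBroadSets() at the XML file downloaded
# from the Broad site instead (placeholder path):
# msig2.5 <- getBroadSets("/path/to/msigdb_v2.5.xml")
```

The bundled file holds only a handful of sets, so it is useful for checking that the import works but not for any real analysis.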
What I will show is probably not the most elegant approach. Assume you have imported the whole MSIGDB as msig2.5:

> isC3 = which(sapply(msig2.5, function(x)bcCategory(collectionType(x))) == "c3")
> C3coll = msig2.5[isC3]
> C3coll
GeneSetCollection
  names: RGAGGAARY_V$PU1_Q6, KRCTCNNNNMANAGC_UNKNOWN, ..., GTTATAT,MIR-410 (837 total)
  unique identifiers: PCDHGA5, CTXL, ..., pp9099 (15718 total)
  types in collection:
    geneIdType: SymbolIdentifier (1 total)
    collectionType: BroadCollection (1 total)
> C3coll[[1]]
setName: RGAGGAARY_V$PU1_Q6
geneIds: PCDHGA5, CTXL, ..., HCMOGT-1 (total: 522)
geneIdType: Symbol
collectionType: Broad
  bcCategory: c3 (Motif)
  bcSubCategory: NA
details: use 'details(object)'
> details(C3coll[[1]])
setName: RGAGGAARY_V$PU1_Q6
geneIds: PCDHGA5, CTXL, ..., HCMOGT-1 (total: 522)
geneIdType: Symbol
collectionType: Broad
  bcCategory: c3 (Motif)
  bcSubCategory: NA
setIdentifier: c3:261
description: Genes with promoter regions [-2kb,2kb] around transcription start site containing the motif RGAGGAARY which matches annotation for SPI1: spleen focus forming virus (SFFV) proviral integration oncogene spi1
  (longDescription available)
organism: Human,Mouse,Rat,Dog
pubMedIds:
urls: msigdb_v2.5.xml
contributor: Xiaohui Xie
setVersion: 0.0.1
creationDate: Thu Jul 10 16:59:23 2008

Invoking the longDescription method on C3coll[[1]] returns an interesting structure that will need to be parsed; it seems to be in a marked-up MEDLINE format.

Once you have found the gene sets you are interested in, GSEABase contains additional infrastructure to convert the gene identifiers used in MSIGDB to array probe set identifiers, Entrez identifiers, etc.
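The category filter and the identifier conversion described above might be sketched as follows. This is only an illustration under several assumptions: subsetByBroadCategory is a hypothetical helper name, the data come from the small "Broad.xml" example file bundled with GSEABase (whose sets are not in category c3, so "c1" is used here), and the chip annotation package hgu95av2.db is an arbitrary choice that is only used if it happens to be installed.

```r
library(GSEABase)

# Hypothetical helper: keep only the sets belonging to one Broad category.
# Assumes every set in the collection has a BroadCollection type.
subsetByBroadCategory <- function(gsc, category) {
  cats <- sapply(gsc, function(x) bcCategory(collectionType(x)))
  gsc[which(cats == category)]
}

# Example collection bundled with GSEABase (stand-in for the full MSIGDB).
fl <- system.file("extdata", "Broad.xml", package = "GSEABase")
demoSets <- getBroadSets(fl)

# With the full MSIGDB the call would use "c3" for the motif gene sets.
c1sets <- subsetByBroadCategory(demoSets, "c1")
c1sets

# Convert gene symbols to probe set identifiers for one chip, assuming
# the corresponding annotation package is available.
if (length(c1sets) > 0 && requireNamespace("hgu95av2.db", quietly = TRUE)) {
  probes <- mapIdentifiers(c1sets[[1]], AnnotationIdentifier("hgu95av2.db"))
  head(geneIds(probes))
}
```

Swapping AnnotationIdentifier for EntrezIdentifier in the mapIdentifiers() call should give Entrez identifiers instead, again assuming a suitable annotation package is installed.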