"Di Wu" <di.wu at med.monash.edu.au> writes:
> Thank you, Martin.
> That's what I need. I have a follow-up basic question.
> How can I transform "collectionType" to character, such as "C2", in case I
> only want to play with the sets from C2.
I'm not quite sure what you're asking, but something like this
> is_c2 <- sapply(gss, function(gs) bcCategory(collectionType(gs)) == "c2")
gives you a logical vector which is TRUE when the bcCategory of the
collectionType of each gene set in gss is "c2". You can then
> c2sets <- gss[is_c2]
to get just those gene sets belonging to c2 (I'm using hints from the
display of the gene set to guess at how to get parts of it out, e.g.,
> gss[[1]]
[...]
collectionType: Broad
bcCategory: c1 (Positional)
bcSubCategory: NA
details: use 'details(object)'
suggests that I can use collectionType on gss[[1]], and bcCategory on
the result of collectionType; I could also look in the help page,
e.g., for GeneSet-class and BroadCollection-class).
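To try the filtering step without the MSigDB file at hand, here is a minimal sketch. It assumes GSEABase is installed, and it builds a toy collection in place of the result of getBroadSets(); the helper make_set, the set names and the gene symbols are all made up for illustration. The point is that bcCategory() already returns an ordinary character vector, so no extra conversion should be needed before comparing with "c2".

```r
library(GSEABase)

# Hypothetical helper: build a small gene set tagged with a Broad category.
# The gene symbols are placeholders, not real MSigDB content.
make_set <- function(name, category) {
    GeneSet(c("GENE1", "GENE2"), setName = name,
            collectionType = BroadCollection(category = category))
}

# Toy stand-in for the collection returned by getBroadSets()
gss <- GeneSetCollection(list(make_set("SET_A", "c1"),
                              make_set("SET_B", "c2"),
                              make_set("SET_C", "c2")))

# One category string per gene set
category <- sapply(gss, function(gs) bcCategory(collectionType(gs)))

# Keep only the sets whose category is "c2"
c2sets <- gss[category == "c2"]
names(c2sets)
```

With the real data, gss would come from getBroadSets(fl) and the last three statements would be used unchanged.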
Also maybe worth pointing out that gene set collections can be subset
by their set names, e.g.,
> details(gss[["KENNY_WNT_UP"]])
setName: KENNY_WNT_UP
geneIds: CUGBP2, ARFGEF2, ..., CASKIN2 (total: 51)
geneIdType: Symbol
collectionType: Broad
bcCategory: c2 (Curated)
bcSubCategory: NA
setIdentifier: c2:803
description: Genes up-regulated by Wnt in HC11 (mammary epithelial
cells)
(longDescription available)
organism: Mouse
pubMedIds: 15642117
urls: file://home/mtmorgan/tmp/msigdb_v2.1.xml
contributor: Yujin Hoshida
setVersion: 0.0.1
creationDate: Tue Jul 15 20:31:53 2008
Hope that's on the right track for what you were looking for,
Martin
> Cheers,
> Di
>
>
> On Wed, Jul 16, 2008 at 12:49 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
> Hi Di --
>
> "Di Wu" <di.wu at med.monash.edu.au> writes:
> > Dear list,
> >
> > I am trying to use MsigDB, the gene set database from GSEA. I am interested
> > to know whether the sets of genes are from human or mouse, particularly in
> > C2. I know I can always click the web and go deep to see how a set was
> > obtained.
> > But is there any coding way to get the species sources for all the gene sets
> > in C2 or MsigDB.
>
>
>
> If you're using the GSEABase package, then each gene set read by getBroadSets
> records the organism, so for example
>
> > fl <- "/path/to/msigdb_v2.1.xml"
> > gss <- getBroadSets(fl) # read entire msigdb
> > organism(gss[[1]])
> [1] "Human"
> > table(sapply(gss, organism))
>
>          Chimpanzee             Generic               Human
>                   1                 456                1769
> Human,Mouse,Rat,Dog               Mouse                 Pig
>                 837                 248                  11
>                 Rat              Rhesus          Zebra Fish
>                   3                   4                   8
>
> > # retrieve a few sets from the web
> > gss <- getBroadSets(asBroadUri(c('chr16q', 'GNF2_ZAP70')))
> > organism(gss[[1]])
> [1] "Human"
>
> As a 'closer to the metal' alternative, you could use the XML package
>
> > xml <- xmlTreeParse(fl, useInternal=TRUE)
> > query <- '//GENESET[@STANDARD_NAME="KENNY_WNT_UP"]/@ORGANISM'
> > xpathApply(xml, query, xmlValue)
> [[1]]
> [1] "Mouse"
>
> > table(unlist(xpathApply(xml, "//@ORGANISM", xmlValue)))
>
>          Chimpanzee             Generic               Human
>                   1                 456                1769
> Human,Mouse,Rat,Dog               Mouse                 Pig
>                 837                 248                  11
>                 Rat              Rhesus          Zebra Fish
>                   3                   4                   8
>
> Martin
>
> > Appreciate your suggestions.
> > Cheers,
> > Di
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M2 B169
> Phone: (206) 667-2793
>
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793