making gene sets for geneSetTest of Limma
2
0
Entering edit mode
@srinivas-iyyer-939
Last seen 9.6 years ago
Dear group, i want to convert all gene sets available from GSEA (C1,C2,C3 and C4) into a 4 different gene sets, so that I can use geneSetTest of limma on above 4 different gene sets. In one of the examples (classic estrogen example, only one set is described). What kind of formats geneSetTest will read for gene sets. ( e.g. every set writting in a single line format OR a list of lists for each gene set). Could anyone suggest some steps to make gene sets for genesettest. thanks Srini
limma convert limma convert • 1.4k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 5 days ago
United States
Hi Srini -- Srinivas Iyyer wrote: > Dear group, > > i want to convert all gene sets available from GSEA > (C1,C2,C3 and C4) into a 4 different gene sets, so > that I can use geneSetTest of limma on above 4 > different gene sets. > Hi Srini -- I'm not really sure what you want to do, or if what you want to do makes sense. Here's my best guess at 'getting the job done', but maybe others will give some more advice one whether it's actually a good idea. One possibility is to visit the Broad and download their entire database http://www.broad.mit.edu/gsea/downloads.jsp (look for 'XML database file'). You can then read these into R with > library(GSEABase) > gss = getBroadSets('/path/to/msigdb_v2.5.xml') you might then > collType = lapply(gss, collectionType) > catType = sapply(collType, bcCategory) > table(catType) catType c1 c2 c3 c4 c5 386 1892 837 883 1454 to get the category of each gene set, and > c1sets = gss[catType=="c1"] to select just the c1 sets. > In one of the examples (classic estrogen example, only > one set is described). > I'm not really sure what you are referring to (where is the example?) and I'm not sure that geneSetTest will do what you want. Say you've performed lmFit. You could make a vector of the relevant est statistic, and make sure the vector elements have appropriate names, e.g., converting probes to Symbol identifiers. Say that vector is x. If you were interested in a particular gene set, say all genes in chromosome band 4q27, you could select that > mySet = gss[["chr4q27"]] You might then do something like > geneSetTest(geneIds(mySet), x, ... (where ... might be addition arguments to geneSetTest). If you wanted to test many sets, you might lapply(gss[catType=="c1"], function(aSet) { geneSetTest(geneIds(aSet), x, ...) }) (where again ... are additional arguments to geneSetTest). The key is to get x to have names that are the same as the geneIds in the gene sets. The mapIdentifiers function in GSEABase might help. Hope this helps, Martin > What kind of formats geneSetTest will read for gene > sets. ( e.g. every set writting in a single line > format OR a list of lists for each gene set). > > Could anyone suggest some steps to make gene sets for > genesettest. > > thanks > Srini > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
Dear Martin, Thank you for your email. I should have explained straight and simple. I want to associate my differentially expressed genes ( data from lmfit and ebayes) to 10 different gene expression studies. What i have from those 10 studies is list of genes that are differentially expressed. Ive written them as the following tab-delim text file (.gmt file as per GSEA) Study1 NA G1 G2 G3 G4... Study2 NA G10,G49, G432..... I want to test these gene sets on my lmfit object using geneSetTest(). In the example listed at http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estroge n.html. example; geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both" ) In this particular example, the author is testing the list of genes with sorted t-statistics obtained from lmfit object on 'knownERgenesOnChip' set that has ~20 genes from that chip only. However, In my case I want to test my lmfit object on 10 different sets. How do I make a smilar 'knownERgenesOnChip' gene set that will have all my 10 gene lists as sets. Simply I do not want to run 10 genesetTest's on each gene set. I want to run only once. In the case of GSEA we make a .gmt file with each line as one set. apologies for not writing my question as simple. Thank you . srini --- Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: > Hi Srini -- > > Srinivas Iyyer wrote: > > Dear group, > > > > i want to convert all gene sets available from > GSEA > > (C1,C2,C3 and C4) into a 4 different gene sets, so > > that I can use geneSetTest of limma on above 4 > > different gene sets. > > > Hi Srini -- I'm not really sure what you want to do, > or if what you want > to do makes sense. Here's my best guess at 'getting > the job done', but > maybe others will give some more advice one whether > it's actually a good > idea. > > One possibility is to visit the Broad and download > their entire database > > http://www.broad.mit.edu/gsea/downloads.jsp > > (look for 'XML database file'). You can then read > these into R with > > > library(GSEABase) > > gss = getBroadSets('/path/to/msigdb_v2.5.xml') > > you might then > > > collType = lapply(gss, collectionType) > > catType = sapply(collType, bcCategory) > > table(catType) > catType > c1 c2 c3 c4 c5 > 386 1892 837 883 1454 > > to get the category of each gene set, and > > > c1sets = gss[catType=="c1"] > > to select just the c1 sets. > > In one of the examples (classic estrogen example, > only > > one set is described). > > > I'm not really sure what you are referring to (where > is the example?) > and I'm not sure that geneSetTest will do what you > want. > > Say you've performed lmFit. You could make a vector > of the relevant est > statistic, and make sure the vector elements have > appropriate names, > e.g., converting probes to Symbol identifiers. Say > that vector is x. > > If you were interested in a particular gene set, say > all genes in > chromosome band 4q27, you could select that > > > mySet = gss[["chr4q27"]] > > You might then do something like > > > geneSetTest(geneIds(mySet), x, ... > > (where ... might be addition arguments to > geneSetTest). If you wanted to > test many sets, you might > > lapply(gss[catType=="c1"], function(aSet) { > geneSetTest(geneIds(aSet), x, ...) > }) > > (where again ... are additional arguments to > geneSetTest). > > The key is to get x to have names that are the same > as the geneIds in > the gene sets. The mapIdentifiers function in > GSEABase might help. > > Hope this helps, > > Martin > > What kind of formats geneSetTest will read for > gene > > sets. ( e.g. every set writting in a single line > > format OR a list of lists for each gene set). > > > > Could anyone suggest some steps to make gene sets > for > > genesettest. > > > > thanks > > Srini > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor at stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > > >
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen just now
WEHI, Melbourne, Australia
Dear Srinivas, You don't seem to have looked at the help document for the geneSetTest() function, and you really do need to read it. That would make it clear that geneSetTest() produces results for only one set at a time. So, if you want results for 10 sets, you have to run geneSetTest() 10 times. There's no alternative. Running geneSetTest() 10 times is still a whole lot quicker than running GSEA once, so it doesn't seem such a problem. Best wishes Gordon ----------- Original Message --------------------------- [BioC] making gene sets for geneSetTest of Limma Srinivas Iyyer srini_iyyer_bio at yahoo.com Wed May 28 05:50:30 CEST 2008 Dear Martin, Thank you for your email. I should have explained straight and simple. I want to associate my differentially expressed genes ( data from lmfit and ebayes) to 10 different gene expression studies. What i have from those 10 studies is list of genes that are differentially expressed. Ive written them as the following tab-delim text file (.gmt file as per GSEA) Study1 NA G1 G2 G3 G4... Study2 NA G10,G49, G432..... I want to test these gene sets on my lmfit object using geneSetTest(). In the example listed at http://www.bioconductor.org/workshops/2005/BioC2005/labs/lab01/estroge n.html. example; geneSetTest(knownERgenesOnChip,completeTableEst10$t,alternative="both" ) In this particular example, the author is testing the list of genes with sorted t-statistics obtained from lmfit object on 'knownERgenesOnChip' set that has ~20 genes from that chip only. However, In my case I want to test my lmfit object on 10 different sets. How do I make a smilar 'knownERgenesOnChip' gene set that will have all my 10 gene lists as sets. Simply I do not want to run 10 genesetTest's on each gene set. I want to run only once. In the case of GSEA we make a .gmt file with each line as one set. apologies for not writing my question as simple. Thank you . srini --- Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: > Hi Srini -- > > Srinivas Iyyer wrote: > > Dear group, > > > > i want to convert all gene sets available from > GSEA > > (C1,C2,C3 and C4) into a 4 different gene sets, so > > that I can use geneSetTest of limma on above 4 > > different gene sets. > > > Hi Srini -- I'm not really sure what you want to do, > or if what you want > to do makes sense. Here's my best guess at 'getting > the job done', but > maybe others will give some more advice one whether > it's actually a good > idea. > > One possibility is to visit the Broad and download > their entire database > > http://www.broad.mit.edu/gsea/downloads.jsp > > (look for 'XML database file'). You can then read > these into R with > > > library(GSEABase) > > gss = getBroadSets('/path/to/msigdb_v2.5.xml') > > you might then > > > collType = lapply(gss, collectionType) > > catType = sapply(collType, bcCategory) > > table(catType) > catType > c1 c2 c3 c4 c5 > 386 1892 837 883 1454 > > to get the category of each gene set, and > > > c1sets = gss[catType=="c1"] > > to select just the c1 sets. > > In one of the examples (classic estrogen example, > only > > one set is described). > > > I'm not really sure what you are referring to (where > is the example?) > and I'm not sure that geneSetTest will do what you > want. > > Say you've performed lmFit. You could make a vector > of the relevant est > statistic, and make sure the vector elements have > appropriate names, > e.g., converting probes to Symbol identifiers. Say > that vector is x. > > If you were interested in a particular gene set, say > all genes in > chromosome band 4q27, you could select that > > > mySet = gss[["chr4q27"]] > > You might then do something like > > > geneSetTest(geneIds(mySet), x, ... > > (where ... might be addition arguments to > geneSetTest). If you wanted to > test many sets, you might > > lapply(gss[catType=="c1"], function(aSet) { > geneSetTest(geneIds(aSet), x, ...) > }) > > (where again ... are additional arguments to > geneSetTest). > > The key is to get x to have names that are the same > as the geneIds in > the gene sets. The mapIdentifiers function in > GSEABase might help. > > Hope this helps, > > Martin > > What kind of formats geneSetTest will read for > gene > > sets. ( e.g. every set writting in a single line > > format OR a list of lists for each gene set). > > > > Could anyone suggest some steps to make gene sets > for > > genesettest. > > > > thanks > > Srini
ADD COMMENT

Login before adding your answer.

Traffic: 926 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6