topGO documentation / clarification
Mike Dewar ▴ 60
@mike-dewar-4038
Last seen 10.4 years ago
Hi,

I've been trying to figure out how to use topGO when all you have is a list of interesting probesets, along with all the probesets on a microarray. My problems, which have been discussed on the biostar webpage, seem to have centered on the specification of geneSel and allGenes. I have ended up not specifying geneSel at all (it would normally be some test on a p-value or other score) and specifying allGenes as a named factor, where a probeset is 1 if I consider it interesting and 0 otherwise. In the code snippet below, `exprset` is an ExpressionSet object and `interesting_genes` is the list of probesets I find interesting.

library(topGO)
library(mogene10sttranscriptcluster.db)
load('exprset')
load('interesting_genes')
all_genes <- rownames(exprs(exprset))
# then make a factor that is 1 if the probeset is "interesting" and 0 otherwise
geneList <- factor(as.integer(all_genes %in% interesting_genes))
# name the factor with the probeset names
names(geneList) <- all_genes
# form the GOdata object
GOdata <- new("topGOdata",
    ontology = "BP",
    allGenes = geneList,
    nodeSize = 5,
    # annot, tells topGO to map from GO terms to "genes"
    annot = annFUN.GO2genes,
    # so annot then calls something to perform this mapping GO2genes,
    # which is this from the mogene... library
    GO2genes = as.list(mogene10sttranscriptclusterGO2PROBE)
)

My questions for the list are:

1) Is this OK? Will the results I get from a Fisher test be valid? They /seem/ fine.
2) If this is valid, would it be worth making it clearer in the topGO documentation? It is specified that allGenes should be a vector of strings, or a named numerical vector.

Cheers,

Mike Dewar

- - -
Dr Michael Dewar
Postdoctoral Research Scientist
Applied Mathematics
Columbia University
http://www.columbia.edu/~md2954/
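The named-factor encoding described above can be checked on toy data without topGO. The sketch below is base R only; the probeset IDs and the single GO term (`term_genes`) are made up for illustration and do not come from the post. It builds the 0/1 factor the same way and then runs a one-sided Fisher test for that one term by hand. This is only meant to show what question the test answers for a given term, not how topGO implements it internally.

```r
# Toy stand-ins for the real objects (hypothetical IDs, not from the post)
all_genes <- c("p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8")
interesting_genes <- c("p2", "p3", "p7")

# Named two-level factor: 1 = interesting, 0 = background.
# Fixing the levels keeps both present even if one group is empty.
geneList <- factor(as.integer(all_genes %in% interesting_genes),
                   levels = c(0, 1))
names(geneList) <- all_genes

# Hypothetical GO term annotated to three of the probesets
term_genes <- c("p2", "p3", "p4")
in_term <- factor(all_genes %in% term_genes, levels = c(FALSE, TRUE))

# 2x2 contingency table: interesting vs. annotated to the term
tab <- table(interesting = geneList, in_term = in_term)

# One-sided test: are interesting probesets over-represented in the term?
ft <- fisher.test(tab, alternative = "greater")
print(tab)
print(ft$p.value)
```

Every probeset on the array appears exactly once in `geneList`, so the background for the test is the whole array, which is the property the named-factor approach relies on.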
Microarray GO topGO • 1.7k views
Adrian Alexa ▴ 400
@adrian-alexa-936
Last seen 10.4 years ago
Hi Mike,

your code looks good and you should get the correct results. Did you read the package vignette? Your issues are explained there. If you still have trouble understanding a specific section, please let me know.

Best regards,
Adrian

On Wed, Aug 11, 2010 at 3:56 PM, Mike Dewar <mike.dewar at columbia.edu> wrote:
> [...]
> 1) is this OK? Will the results I get from a Fisher Test be valid? They /seem/ fine.
> 2) if this is valid, would it be worth making it clearer in the topGO documentation? It is specified that allGenes should be a vector of strings, or a named numerical vector.