HyperGTest and genelists


davidl@unr.nevada.edu ▴ 140


Hello everyone,

I like that the new HyperGTest function lets you specify the gene universe, get only the more specific GO terms in your results, and easily output a report with the expected counts, actual counts, and p-values. With the old GOHyperG results, I had written a function that retrieved the list of significant GO terms, found the gene names associated with those GO terms, and found the intersection of those gene names and my genes of interest. So my question is this:

Is there a way to get the gene names of the genes of interest that were associated with the over- or under-represented GO terms found with HyperGTest? I noticed there is a function for the GOHyperGResult object (geneIdUniverse) that retrieves the Entrez gene identifiers from your gene universe for all the tested GO terms. Is there a way to get only the Entrez gene identifiers from your genes-of-interest group? Could you then filter out the GO terms that did not meet the p-value cutoff? Is there a function that could be applied to this list to change those Entrez identifiers into gene names?

Or is there an easier way to get the names of the genes from your genes of interest that contributed to the "Count" column of the HTML report? For example, if there were a method for retrieving the GO terms that met the p-value cutoff from the GOHyperGResult object, I could just use the function I've already written.

Thank you very much in advance for whatever help you can provide. I really like the new hypergeometric function, and I hope this can be figured out.

Thank you,

Dave
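The manual approach described in this question (filter terms by p-value, then intersect each term's genes with the genes of interest) can be sketched in base R. This is only an illustration on toy data, not GOstats code: `universe_by_term`, `pvals`, and `genes_of_interest` are hypothetical stand-ins for the per-term Entrez IDs (the kind of list geneIdUniverse is said to return), the per-term p-values, and the selected gene list. The term labels and IDs are placeholders.

```r
# Toy stand-ins (hypothetical data, not output from a real analysis).
universe_by_term <- list(
  "GO:A" = c("10", "20", "30"),
  "GO:B" = c("20", "40"),
  "GO:C" = c("50", "60")
)
pvals <- c("GO:A" = 0.001, "GO:B" = 0.20, "GO:C" = 0.01)
genes_of_interest <- c("10", "20", "60")

# Keep only the terms that pass the p-value cutoff.
cutoff <- 0.05
sig_terms <- names(pvals)[pvals < cutoff]

# For each significant term, keep the universe genes that are
# also in the genes-of-interest list.
hits_by_term <- lapply(universe_by_term[sig_terms], intersect, genes_of_interest)
```

The length of each element of `hits_by_term` corresponds to what the question calls the "Count" for that term, assuming the genes of interest are a subset of the universe.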


Updated 18.7 years ago by Seth Falcon • written 18.7 years ago by davidl@unr.nevada.edu


James W. MacDonald 68k


Hi David,

Does probeSetSummary() do what you want?

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center

Written 18.7 years ago by James W. MacDonald


Seth Falcon ★ 7.4k


Hi Dave,

Have you had a look at the new vignette in the devel version of GOstats? If not, it might help with some of your questions. You can find it here: http://www.bioconductor.org/packages/2.0/bioc/html/GOstats.html

davidl at unr.nevada.edu writes:

> Is there a way to get the gene names of the genes of interest which were associated with the over or under represented GO terms found with HyperGTest? I noticed there is a function for the GOHyperGResult object (geneIdUniverse) which retrieves the entrez gene identifiers from your gene universe for all the tested GO terms. Is there a way to get only the entrez gene identifiers from your genes-of-interest group?

I don't think that GOstats currently provides the exact features you want, but in the devel version there is some progress along these lines:

    sigCategories(hgOver, p)

will list the GO IDs that were significant given a p-value cutoff of p.

    selectedGenes(hgOver, id)

will return a list with an element for each GO ID given in id, containing the Entrez IDs that are in the intersection of the GO term and the selected gene ID list.

So I think you want:

    selectedGenes(hgOver, sigCategories(hgOver))

Then you have to convert the Entrez IDs to gene symbols. We hope to be adding some functions to the annotate package to make these sorts of transformations easy.

> Could you then filter out the GO terms which did not meet the p-value cutoff?

See the vignette; there are a number of ways to do this.

> Is there a function which could be applied to this list to change those entrez identifiers into gene names?

Not yet, but I like the idea.

> Or is there an easier way to get the names of the genes from your genes of interest which contributed to the "Count" column of the html report? For example, if there was a method for retrieving the GO terms which met the p-value cutoff from the GOHyperGResult object, I could just use the function I've already written.

Not sure what you mean by retrieving the GO terms. Perhaps this helps you?

    library("annotate")
    library("GO")
    sapply(mget(sigCategories(hgOver), GOTERM), Term)

+ seth

PS: Feedback on selectedGenes and sigCategories is most welcome. These are new functions that I've been playing with, and I have not had a chance to finalize the interface and document them.

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
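Since the reply notes that the Entrez-ID-to-symbol step is not yet provided, here is one way it might be done by hand in base R. This is a sketch under assumptions: `selected` is a hypothetical list shaped like the result described above (one element of Entrez IDs per significant GO ID), and `entrez_to_symbol` is a made-up lookup vector standing in for whatever annotation source is actually available. None of the IDs or symbols are real.

```r
# Hypothetical result: one element per significant GO ID,
# each holding the Entrez IDs of the selected genes in that term.
selected <- list(
  "GO:A" = c("10", "20"),
  "GO:C" = c("60", "999")
)

# Hypothetical Entrez ID -> gene symbol lookup (made-up symbols).
entrez_to_symbol <- c("10" = "GENE1", "20" = "GENE2", "60" = "GENE3")

# Translate each element by name lookup; IDs that are missing
# from the lookup come back as NA rather than being dropped.
symbols_by_term <- lapply(selected, function(ids) unname(entrez_to_symbol[ids]))
```

Keeping the NA entries makes unmapped IDs visible, which seems safer than silently shortening the gene list for a term.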

Written 18.7 years ago by Seth Falcon


> > Is there a function which could be applied to this list to change those entrez identifiers into gene names?
>
> Not yet, but I like the idea.

How about adding it in the xxxLLMappings annotations? It seems logical to me to have them organized the same way as the chip annotations. It would also be useful to match the mapping names to the annotations so they can be used directly with HyperGTest and other functionality (for example, looking at the result of a genome-wide scan of promoter elements).

Since the chip annotations basically go:

    probe -> EntrezID -> everything else

I would guess it should not be too difficult to add "EntrezID -> everything else" in the xxxLLMappings. It might also be a way to simplify the generation of annotations if the probe -> EntrezID and EntrezID -> rest mappings were separated, but I don't know how much work would be involved there.

Francois
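The separation suggested here can be illustrated with two small lookup vectors in base R. This is a toy sketch, not how the annotation packages are built: `probe_to_entrez` and `entrez_to_symbol` are hypothetical maps with made-up IDs, and the point is only that a probe-level annotation falls out of chaining them.

```r
# Hypothetical probe -> Entrez ID map (two probes share one gene).
probe_to_entrez <- c("probe_1" = "10", "probe_2" = "20", "probe_3" = "10")

# Hypothetical Entrez ID -> annotation map (here, made-up symbols).
entrez_to_symbol <- c("10" = "GENE1", "20" = "GENE2")

# Chain the two maps: probe -> Entrez ID -> annotation.
probe_to_symbol <- setNames(
  unname(entrez_to_symbol[probe_to_entrez]),
  names(probe_to_entrez)
)
```

With this layout the Entrez-keyed map can be used on its own for gene lists that never came from a chip, which is the use case raised above.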

Written 18.7 years ago by Francois Pepin
