GoStats
David ▴ 860
@david-3335
Last seen 6.1 years ago
Hi,

I'm a bit confused about how to use my data. My input is a list of genes (in fact, a list of genes targeted by microRNAs). The first step is to get the GO terms associated with these genes, and then I would like to run a hypergeometric test to find significantly dysregulated GO terms. All the examples I went through use Affy data or similar, so I'm not sure this is correct. I would appreciate your feedback.

    library("GOstats")
    library("GSEABase")
    library(org.Hs.eg.db)

    data = "genes.txt"  # A list of genes ( "MED13" "ENDOD1" "RAP2C" "ACSL1" ...)
    g = read.table(file = data)
    genes <- as.character(g[, 1])

    # Get mapping to GO
    frame <- merge(toTable(org.Hs.egALIAS2EG[genes]), toTable(org.Hs.egGO),
                   by.x = "gene_id", by.y = "gene_id")

    goframeData = data.frame(frame$go_id, frame$Evidence, frame$gene_id)
    goFrame = GOFrame(goframeData, organism = "Homo sapiens")
    goAllFrame = GOAllFrame(goFrame)

    # From here I'm a bit confused. Since I have my list of GO terms, do I
    # need to use the universe data, or do I apply a hypergeometric test to
    # the data above? Thanks for your input.

    gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
    universe = Lkeys(org.Hs.egGO)
    params <- GSEAGOHyperGParams(name = "My Custom GSEA based annot Params",
                                 geneSetCollection = gsc,
                                 geneIds = unique(frame$gene_id),
                                 universeGeneIds = universe,
                                 ontology = "BP",
                                 pvalueCutoff = 0.05,
                                 conditional = FALSE,
                                 testDirection = "over")
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.7 years ago
United States
Hi David,

You need three things to do this analysis:

1) a list of "interesting" genes (presumably your list of genes targeted by microRNAs),
2) a list of ALL the genes that you tested (the gene universe), and
3) the annotations that map GO terms to all genes, not just the ones that you think are interesting. We want ALL of the annotations.

It is this third thing (the annotations of GO IDs to all your possible genes) that you need to make into a GOAllFrame object. Once you have that, you will be testing the first thing (the list of interesting genes) for enrichment relative to the second thing (the gene universe). The essence of the test is to use the annotations to ask whether there are GO terms that are over- or under-represented in your "interesting" list of genes relative to the list of all possible genes (the gene universe).

The usage is described in the vignette titled "Hypergeometric tests for less common model organisms" in the GOstats package, which you can read here:

http://www.bioconductor.org/help/bioc-views/devel/bioc/html/GOstats.html

Does that help clear things up?

Marc

On 03/29/2011 06:33 AM, David martin wrote:
> Hi,
> I'm a bit confused in the way of using my data.
> [...]
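The three ingredients described in the answer can be sketched in R roughly as follows. This is a minimal sketch and not necessarily the exact code the vignette uses: the gene symbols are the four quoted in the question (in practice they would be read from a file such as genes.txt), the parameter set name is made up, and the choice of universe (every gene with at least one GO annotation) is an assumption. If the real set of candidate genes is known, for example all genes considered by the target-prediction step, that set is the more appropriate universe.

```r
library("GOstats")
library("GSEABase")
library("org.Hs.eg.db")

# 1) Interesting genes: the symbols quoted in the question, mapped to
#    Entrez Gene IDs. An alias can map to more than one ID, so the
#    result is flattened and de-duplicated.
symbols <- c("MED13", "ENDOD1", "RAP2C", "ACSL1")
entrez <- unique(unlist(mget(symbols, org.Hs.egALIAS2EG, ifnotfound = NA),
                        use.names = FALSE))
entrez <- entrez[!is.na(entrez)]

# 3) Annotations for ALL genes, not only the interesting ones.
frame <- toTable(org.Hs.egGO)
goframeData <- data.frame(frame$go_id, frame$Evidence, frame$gene_id)
goFrame <- GOFrame(goframeData, organism = "Homo sapiens")
goAllFrame <- GOAllFrame(goFrame)
gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())

# 2) Gene universe. ASSUMPTION: every gene with at least one GO annotation.
universe <- unique(frame$gene_id)
entrez <- intersect(entrez, universe)  # keep only genes inside the universe

params <- GSEAGOHyperGParams(name = "miRNA target GO enrichment",  # hypothetical name
                             geneSetCollection = gsc,
                             geneIds = entrez,
                             universeGeneIds = universe,
                             ontology = "BP",
                             pvalueCutoff = 0.05,
                             conditional = FALSE,
                             testDirection = "over")
over <- hyperGTest(params)
result <- summary(over)
head(result)
```

The only structural difference from the code in the question is that the GO table is no longer merged with the interesting genes before the GOFrame is built, so the gene set collection covers the whole universe.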
OK, I think that was pretty clear. What I'm missing in the GOAllFrame is all the GO terms mapped to all my genes. I'll give it another try now and come back to the forum if I encounter any problems.

Thanks

On 03/30/2011 02:04 AM, Marc Carlson wrote:
> Hi David,
>
> You need to have 3 things to do this analysis.
> [...]
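To illustrate why the annotations have to cover the whole universe, the test for a single GO term can be sketched in base R with `phyper`. All the counts below are made-up toy numbers, and this shows only the underlying hypergeometric calculation, not what GOstats does internally.

```r
# Toy counts (hypothetical, for illustration only)
N <- 20000  # genes in the universe
K <- 200    # universe genes annotated to the GO term
n <- 100    # interesting genes
k <- 5      # interesting genes annotated to the GO term

# Over-representation: P(X >= k) when drawing n genes from the universe
p_over <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same p-value from a one-sided Fisher exact test on the 2x2 table
tab <- matrix(c(k, K - k, n - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# If the annotations cover only the interesting genes, the annotated
# universe collapses onto the interesting list (N = n, K = k), and the
# observed overlap is the only possible outcome.
p_degenerate <- phyper(k - 1, k, n - k, n, lower.tail = FALSE)

c(p_over = p_over, p_fisher = p_fisher, p_degenerate = p_degenerate)
```

K can only be counted when every gene in the universe carries its GO annotations, which is why the annotation frame should be built from the full GO table rather than from the interesting genes alone.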