hyperGTest html report
@sebastien-gerega-2229
Hi, is there any way to get additional information into the hyperGTest html report? Specifically, I would like to include the Entrez IDs for the genes contributing to each overrepresented GO term. thanks, Sebastien
Seth Falcon
@seth-falcon-2443
Hi Sebastien,

Sebastien Gerega <seb at gerega.net> writes:
> is there any way to get additional information into the hyperGTest
> html report? Specifically, I would like to include the Entrez IDs
> for the genes contributing to each overrepresented GO term.

I don't think you've missed any options. If you want to enhance the output, you will need to write some code. The htmlReport method works in concert with the summary method for GOHyperGResult objects. The bulk of both methods is actually defined in the Category package. Have a look at Category/R/summary-methods.R.

+ seth

--
Seth Falcon | seth at userprimary.net | blog: http://userprimary.net/user/
@james-w-macdonald-5106
Hi Sebastien,

Maybe not directly, but note that htmlReport() is simply using xtable to create the HTML page using the output from summary(). So you could just create the table, add a column of Entrez Gene IDs, and then output the result.

Say your GOHyperGResult object is called 'hyp':

    out <- summary(hyp, htmlLinks=TRUE, categorySize=10)

Note that the categorySize argument isn't necessary, but it does protect you from choosing arguably spurious results (like a GO term with 3 genes in the universe and 1 that was significant).

Now you are going to have to create a vector containing all the Entrez Gene IDs for each GO term. For this to work in HTML, you will also need to separate each ID with a <P>EntrezGeneID</P>, so you will need to either cat() or paste() things together. Once you have that, just add it to the data.frame created above:

    out <- data.frame(out, entregeneidvector)
    xtab <- xtable(out, caption="A Caption", digits=rep(c(3,0), c(4,8)))
    print(xtab, type="html", file="A file name.html",
          caption.placement="top", sanitize.text.function=function(x) x,
          include.rownames=FALSE)

HOWEVER, that might not really be what you want, as it will obviously be a bit of work, and could get really messy if there are dozens of Entrez Gene IDs for a particular GO term. An alternative is to output individual HTML tables for each GO term of interest that list out the probesets that contributed to the significance of that term. For that you might want to look at hyperGoutput() in the affycoretools package.

Best,

Jim

Sebastien Gerega wrote:
> Hi,
> is there any way to get additional information into the hyperGTest html
> report?
> Specifically, I would like to include the Entrez IDs for the genes
> contributing to each overrepresented GO term.
> thanks,
> Sebastien
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
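The categorySize caveat above can be made concrete with a small worked example. This is only a sketch using base R's phyper(); the universe and gene-list sizes are made-up numbers, not values from this thread.

```r
# Hypothetical sizes, chosen only for illustration.
universe_size <- 10000  # genes in the universe
selected_size <- 100    # genes in the selected gene list
term_size     <- 3      # universe genes annotated to the GO term
hits          <- 1      # selected genes annotated to the term

# Over-representation p-value: P(X >= hits) for a hypergeometric X.
# phyper(q, ..., lower.tail = FALSE) gives P(X > q), hence hits - 1.
p_small_term <- phyper(hits - 1, term_size, universe_size - term_size,
                       selected_size, lower.tail = FALSE)
print(p_small_term)
```

With these numbers a single hit in a three-gene term already falls below 0.05, which is the kind of result a minimum category size filters out.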
Thanks for that!
I can now almost get what I want. Here is the code I use:

    hgOver = hyperGTest(params)
    report = summary(hgOver, htmlLinks=TRUE)
    cats = sigCategories(hgOver)
    reportGenes = vector()

    for(i in 1:length(cats)){
      reportGenes = append(reportGenes, geneIdsByCategory(hgOver, cats[i]))
    }

This gives me reportGenes as a list something like this:

    $04650
    [1] 10451 4277 5296 5880 6464 8743 8795 8797

    $04670
    [1] 10451 1365 5296 5829 5880 6387 6494 87 9564

    $00150
    [1] 3291 51451 6715

    $04080
    [1] 154 2150 4886 4923 7433

    $04360
    [1] 10512 1969 2043 56920 57522 57556 5880 6387

I would then like to run the following code:

    report <- data.frame(report, reportGenes)
    xtab <- xtable(report, caption="A Caption")
    print(xtab, type="html", file="Afile.html", caption.placement="top",
          sanitize.text.function=function(x) x, include.rownames=FALSE)

But I get the following error:

    Error in data.frame("04650" = c(10451L, 4277L, 5296L, 5880L, 6464L, 8743L, :
      arguments imply differing number of rows: 8, 9, 3, 5, 7

How should I deal with this list so that I can add it to the data.frame? And are there any faster ways to do what I have done in this code? I am still getting used to R.

thanks heaps,
Sebastien

Sebastien Gerega wrote:
> But I get the following error:
> Error in data.frame("04650" = c(10451L, 4277L, 5296L, 5880L, 6464L,
> 8743L, :
> arguments imply differing number of rows: 8, 9, 3, 5, 7

This is the part where I said you have to wrap the Entrez Gene IDs in <P>EGID</P> so you can a) have a vector of the correct length, and b) create a table that will be readable.

Something like this should suffice:

    rg.out <- sapply(reportGenes, function(x)
        paste("<P>", paste(x, collapse="</P><P>"), "</P>", sep=""))

then use rg.out in lieu of reportGenes when making the data.frame.

Best,

Jim
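To see why the error arises and why collapsing fixes it, here is a small self-contained sketch. The list and table are toy stand-ins (gene IDs shortened from the ones posted above, column names hypothetical), not the real summary() or geneIdsByCategory() output.

```r
# Toy stand-in for the per-category gene list (IDs shortened from the thread).
reportGenes <- list("04650" = c(10451L, 4277L, 5296L, 5880L),
                    "00150" = c(3291L, 51451L, 6715L),
                    "04080" = c(154L, 2150L))

# Toy stand-in for the summary table: one row per category.
report <- data.frame(ID = names(reportGenes), Count = lengths(reportGenes),
                     stringsAsFactors = FALSE)

# Passing the list directly makes every element its own column, and the
# columns have different lengths, so data.frame() stops with an error.
failed <- inherits(try(data.frame(report, reportGenes), silent = TRUE),
                   "try-error")

# Collapse each element to ONE string, giving one value per table row.
rg.out <- vapply(reportGenes, function(x)
  paste0("<P>", paste(x, collapse = "</P><P>"), "</P>"), character(1))

out <- data.frame(report, Genes = rg.out, stringsAsFactors = FALSE)
print(out$Genes)
```

vapply() is used here instead of sapply() only so that the result is guaranteed to be a character vector with one string per category.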
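On the "faster ways" question above: as far as I know, geneIdsByCategory() accepts a whole vector of category IDs, so geneIdsByCategory(hgOver, cats) may replace the loop, but please check that against the Category documentation. Separately, the thread does not say whether the list and the summary table come back in the same order, so it seems safer to align them by ID. The sketch below uses toy data and a hypothetical ID column name; note that with htmlLinks=TRUE the ID column holds HTML rather than bare IDs, so the matching would need to be done against a summary made without links.

```r
# Toy summary table; the "ID" column name is hypothetical.
report <- data.frame(ID = c("04670", "04650", "00150"),
                     Pvalue = c(0.001, 0.004, 0.020),
                     stringsAsFactors = FALSE)

# Toy named list, deliberately in a different order from the table.
genesByCat <- list("00150" = c(3291L, 51451L, 6715L),
                   "04650" = c(10451L, 4277L),
                   "04670" = c(1365L, 5829L, 87L))

# Index the list by the table's ID column so each row gets its own genes.
aligned <- genesByCat[report$ID]

# One comma-separated string per row.
report$Genes <- vapply(aligned, paste, character(1), collapse = ", ")
print(report)
```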
> then use rg.out in lieu of reportGenes when making the data.frame.

Great, thank you very much for that. I am using the following code:

    report = summary(hgOver, htmlLinks=TRUE)
    cats = sigCategories(hgOver)
    reportGenes = vector()
    for(i in 1:length(cats)){
      reportGenes = append(reportGenes, geneIdsByCategory(hgOver, cats[i]))
    }
    reportGenes = sapply(reportGenes, function(x) paste(x, collapse=", "))
    report = data.frame(report, Genes=reportGenes)
    xtab = xtable(report, caption="Gene to GO MF test for over-representation")
    print(xtab, type="html", file="p2007_0031_T47D_48h_GO_MF.html",
          caption.placement="top", sanitize.text.function=function(x) x,
          include.rownames=FALSE)

That way everything gets put in one line and it is a bit more compact. I don't suppose there is an easy way to make the gene IDs link to the Entrez website?

cheers,
Sebastien
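For the linking question above, one possible sketch is to wrap each ID in an anchor tag before collapsing. The URL pattern below (NCBI Gene pages addressed by ID) is my assumption and does not come from this thread; link_genes is a hypothetical helper name, and the list is toy data.

```r
# Assumed URL pattern for an Entrez Gene page (not taken from the thread).
entrez_url <- "https://www.ncbi.nlm.nih.gov/gene/"

# Hypothetical helper: turn a vector of IDs into one string of HTML links.
link_genes <- function(ids) {
  links <- sprintf('<a href="%s%s">%s</a>', entrez_url, ids, ids)
  paste(links, collapse = ", ")
}

# Toy stand-in for the per-category gene list.
reportGenes <- list("04080" = c(154L, 2150L),
                    "00150" = c(3291L, 51451L, 6715L))

geneLinks <- vapply(reportGenes, link_genes, character(1))
print(geneLinks[["04080"]])
```

Because the print() call in the code above already passes sanitize.text.function=function(x) x, the anchor tags should reach the HTML file unescaped.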
Sebastien Gerega wrote:
> That way everything gets put in one line and it is a bit more compact. I
> don't suppose there is an easy way to make the gene IDs link to the
> Entrez website?

Of course. This is just HTML. All you have to do is wrap the Entrez Gene IDs in <a href="...">XXXXX</a>, where XXXXX is the Entrez Gene ID and the href points at that gene's Entrez page.

Best,

Jim