How to create a GO2gene object for topGO?
Quin Wills ▴ 100
@quin-wills-2709
Last seen 10.3 years ago
Hello all,

I have some significant Illumina v1 gene expression probes (and their probe 'universe') that I want to run GO enrichment analysis on.

I assume that:
(i) I need illuminaHumanv1.db for the GO2PROBE mappings
(ii) I need to create a GO2gene object for input into topGO as:

    new("topGOdata", ontology="BP", allGenes=my.probe.list,
        annot=annFUN.GO2genes, GO2gene=my.GO2gene)

I'm just not joining the mental dots between (i) and (ii). Or am I completely missing the point? Any quick/simple guidance to get from my probes to a topGOdata object would be very, very welcome - thanks!

Quin

Quin Wills
DPhil candidate
Department of Statistics
University of Oxford
GO topGO • 2.6k views
@michael-watson-iah-c-378
Last seen 10.3 years ago
This code might work.

In this example, my data is in a data.frame called array2go.

The column "Accn" contains the identifier for spots on the array.
The column "GO_ID" contains the GO identifier.
The column "Category" contains the GO category.

geneNames is all Accns on the array.
sigGenes is the significant set.

    mf <- array2go[array2go$Category=="Function",]
    mygene2GO <- sapply(unique(as.vector(mf$Accn)),
                        function(x)
                        as.character(unique(mf$GO_ID[mf$Accn==x])))
    geneNames <- unique(array2go$Accn)
    sigGenes # this comes from somewhere!

    geneList <- factor(as.integer(geneNames %in% sigGenes))
    names(geneList) <- geneNames

    GOdata <- new("topGOdata",
                  ontology="MF",
                  allGenes=geneList,
                  annot=annFUN.gene2GO,
                  gene2GO=mygene2GO)

    test.stat <- new("classicCount",
                     testStatistic=GOFisherTest,
                     name="Fisher Test")

    resultFis <- getSigGroups(GOdata, test.stat)

    res <- GenTable(GOdata, classic=resultFis, topNodes=288)
    res[1:10,]
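For the original question, where the starting point is a GO-to-probe mapping rather than a data.frame, the same gene2GO list can be reached by inverting the mapping. Below is a minimal base-R sketch under stated assumptions: `go2probe` is a made-up stand-in for what `as.list()` on a chip package's GO2PROBE object might look like, and the probe IDs, GO IDs and `sig_probes` are all hypothetical. The topGO call is left commented out because the toy GO IDs are not real terms.

```r
# Toy stand-in for a GO -> probe mapping (all IDs are made up for illustration)
go2probe <- list(
  "GO:0000001" = c("probe_a", "probe_b"),
  "GO:0000002" = c("probe_b", "probe_c"),
  "GO:0000003" = "probe_a"
)

# Flatten to long form: column 'values' holds probes, 'ind' holds GO terms
long <- stack(go2probe)

# Invert: one list element per probe, holding its GO terms
probe2GO <- split(as.character(long$ind), long$values)
probe2GO <- lapply(probe2GO, unique)

# Probe universe and a hypothetical set of significant probes
universe <- names(probe2GO)
sig_probes <- c("probe_a")

# Named 0/1 factor over the universe, in the shape used for allGenes
geneList <- factor(as.integer(universe %in% sig_probes), levels = c(0, 1))
names(geneList) <- universe

# With topGO loaded and real GO IDs, these would be passed on as in the reply:
# GOdata <- new("topGOdata", ontology = "BP", allGenes = geneList,
#               annot = annFUN.gene2GO, gene2GO = probe2GO)
```

Using `split()` always returns a list, whereas `sapply()` may simplify its result to a vector or matrix when every identifier happens to have the same number of GO terms.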
Hello all,

I have been looking at the mdqc package for automatic quality assessment of a large set of Affy SNP 6.0 data. I have already generated a set of QC stats using Affy's own software, and they exclude outlier arrays using a fixed cut-off of the contrast QC scores (basically a measure of how separated the three genotype clouds are). I wanted to see if mdqc would give me the same answers.

Here are some of the contrast QC scores for the first 6 arrays (out of 140). A value less than 0.4 in any of these columns could be a quality problem according to Affy.

    > allQC[1:6,]
      Contrast.QC Contrast.QC..Random. Contrast.QC..Nsp. Contrast.QC..Sty. Contrast.QC..Nsp.Sty.Overlap.
    1        0.72                 0.72              0.79              1.00                          1.38
    2        0.42                 0.42              0.72              0.35                          0.99
    3        1.08                 1.08              0.97              1.28                          1.30
    4        0.50                 0.50              0.75              0.79                          0.64
    5        0.00                 0.00              0.00             -0.22                          0.00
    6        0.47                 0.47              0.76              0.49                          0.71

As you can see, array 5 is clearly an outlier (<0.4) in all 5 columns and we flagged it as such originally. However, when running mdqc, it does not call array 5 an outlier at the greatest significance level. Intuitively I would expect this array to have the most extreme quality measure.

    > mout=mdqc(allQC)
    > mout

    Method used: nogroups
    Number of groups: 1
    Robust estimator: S-estimator

    MDs exceeding the square root of the 90 % percentile of the Chi-Square distribution
     [1] 5 8 14 16 48 63 75 78 81 86 91 114 117 122 126 131 132 134 137 138

    MDs exceeding the square root of the 95 % percentile of the Chi-Square distribution
     [1] 5 8 14 48 75 78 81 86 91 114 122 126 131 132 137 138

    MDs exceeding the square root of the 99 % percentile of the Chi-Square distribution
     [1] 48 78 81 86 122 126 131 137 138

Which leads me (finally!) to my questions:

- Is mdqc getting confused by the fact that array 5 is consistently low in all QC measures?
- Does mdqc automatically assume that higher values indicate lower array quality, or vice versa?

Many thanks in advance for any input,

Cheers,
Mark

PS here is my sessionInfo()

    > sessionInfo()
    R version 2.8.0 alpha (2008-10-04 r46598)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] mdqc_1.4.0 MASS_7.2-44 cluster_1.11.11
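On the second question, a Mahalanobis distance by itself carries no notion of "better" or "worse": it measures how far a point lies from the centre relative to the covariance, in any direction. The base-R sketch below illustrates only that property and is not a reproduction of mdqc. It assumes the classical mean and covariance instead of mdqc's robust S-estimator, and the `qc` matrix and `shift` values are made up.

```r
# Toy QC matrix (made-up values): 6 arrays x 2 quality measures
qc <- cbind(m1 = c(0.90, 1.00, 1.10, 1.00, 0.95, 1.05),
            m2 = c(0.80, 0.70, 0.90, 0.80, 0.75, 0.85))

# Classical centre and covariance (mdqc would use a robust estimator instead)
center <- colMeans(qc)
covmat <- cov(qc)

# Two hypothetical arrays: one uniformly low, one uniformly high
shift <- c(0.5, 0.5)
probe_points <- rbind(low = center - shift, high = center + shift)

# mahalanobis() returns SQUARED distances
d2 <- mahalanobis(probe_points, center, covmat)

# Comparing d2 with the chi-square quantile is equivalent to comparing
# the distance itself with the square root of that quantile
cutoff <- qchisq(0.99, df = ncol(qc))
flagged <- d2 > cutoff

print(d2)
print(flagged)
```

Both hypothetical arrays get the same distance, so under this definition a uniformly low array and a uniformly high one are treated alike. Which arrays are flagged in practice depends on the estimated centre and covariance, which is where a robust estimator can differ from this sketch.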
Thanks a stack, Michael... that's crystal clear. Mental dots are joined and I think I can take it from here. The topGO documentation was just really not clear.

Quin