GoStats
Auer Michael ▴ 250
When applying GOHyperG to a series of Locus IDs, the function only reports the GO IDs and the corresponding frequencies. But what most people want when interpreting the GO results is the corresponding Affy IDs that map to the different Locus IDs and GO IDs.

The code I use:

    # retrieve the LocusLink IDs for a series of Affy IDs
    Locusids <- mget(AffyIDS, env = hgu133aLOCUSID)

    # report the occurring Gene Ontology terms, their frequencies
    # and the hypergeometric p-values
    MF <- GOHyperG(Locusids, lib = "hgu133a", what = "MF")

How can I get a result of the form

1. GO ID, freq (output of GOHyperG)
2. GO ID, LOCUS ID (multiple occurring GO IDs)
3. LOCUS ID, AFFY ID (multiple occurring LOCUS IDs)

I know that the expression hgu95av2GO2ALLPROBES$"GO:0000166" retrieves all the probes mapping to a GO term, but that is still not the result I am after. One would have to write a for loop and look for the occurring probes in the sample. But isn't there a better way?

Thanks a lot
Michael Auer
rgentleman ★ 5.5k
On Thu, Oct 07, 2004 at 06:25:45PM +0200, Auer Michael wrote:
> When applying GOHyperG to a series of Locus IDs, the function only reports
> the GO ID and the corresponding frequencies. But what most people want to
> have, when interpreting the GO results is the corresponding Affy IDS which
> map on the different Locus IDs and GO IDS.

I will add this feature to the next release. You will then get the GO ID and the set of LocusLink IDs that are mapped to that GO ID. You would then use some other tools (like those you suggest below) to find the related Affymetrix (or other manufacturer) IDs.

> How can I get a result of the form
> 1. GO ID, freq (output of GoHyperG)
> 2. GO ID, LOCUS ID (multiple occurring GO IDs)
> 3. LOCUS ID, AFFY ID (multiple occurring LOCUS IDs)

What you would need to do now is take the set of LocusLink IDs you supplied to the call to GOHyperG, together with the output set of GO IDs, and find which LLs are linked to which GO IDs. That mapping is available in GOALLLOCUSID, in GO 1.6.4.

Finally, you would map the LLs to whatever manufacturer IDs you have. That would be done from the manufacturer-specific data package.

> I know that there is the function hgu95av2GO2ALLPROBES$"GO:0000166" which
> retrieves all the probes mapping on a GO, but still it is not the result.
> Ok, one would have to make a for loop and look for the occurring probes in
> the sample. But isn't there a better way????

I am not sure what you mean by a better way, nor how it would necessarily answer your question.
--
Robert Gentleman
Associate Professor, Department of Biostatistics
Harvard School of Public Health
phone: (617) 632-5250, fax: (617) 632-2444, office: M1B20
email: rgentlem@jimmy.harvard.edu
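For readers who want to see the two mapping steps described in this reply spelled out, here is a minimal base R sketch. It does not use the annotation packages at all: `probe2ll` and `go2ll` are hypothetical toy stand-ins for what a chip's LOCUSID environment and GOALLLOCUSID would provide, and all IDs in them are made up.

```r
# Toy stand-ins (hypothetical data, not real annotation):
# probe ID -> LocusLink ID, as a lookup in a chip's LOCUSID environment would give
probe2ll <- c("p1_at" = "101", "p2_at" = "101", "p3_at" = "202", "p4_at" = "303")
# GO ID -> all LocusLink IDs annotated to that term
go2ll <- list("GO:0000001" = c("101", "202", "999"),
              "GO:0000002" = c("303"))

# LocusLink IDs actually present in the sample
sample_ll <- unique(unname(probe2ll))

# Table 2: GO ID / LocusLink ID pairs, restricted to the sample
go_ll <- do.call(rbind, lapply(names(go2ll), function(go) {
  ll <- intersect(go2ll[[go]], sample_ll)
  data.frame(GO = rep(go, length(ll)), LL = ll, stringsAsFactors = FALSE)
}))

# Table 3: LocusLink ID / probe ID pairs
ll_probe <- data.frame(LL = unname(probe2ll), AFFY = names(probe2ll),
                       stringsAsFactors = FALSE)

# Join the two tables to get GO ID, LocusLink ID and probe ID side by side
go_ll_probe <- merge(go_ll, ll_probe, by = "LL")
```

With real data, the two toy objects would be filled from the GO and chip annotation packages; the join itself stays the same.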
Hi Michael,

In case you're impatient: I wrote a function (it's at the bottom of this message) that more or less does what you want automatically. I need to warn you beforehand: I am a biologist and a beginner at R, so it's far from perfect! The table you get as output will need some cleaning up, as you will see. I do that using an Excel macro; it should be possible in R, I just don't know how.

In this function, MATR is my matrix with gene expression changes of 8000 genes over time, so replace that with your own. The first column is the spot identifier (so the probe name in the case of an Affy array). I'm using my own annotation package called 'floor', so replace that with your Affy package.

Use GOtable("0006818") if you want to know which genes belong to the 'hydrogen transport' class, for instance. The output is a text file in the form of a table with all genes that belong to the GO term you specified, with the feature identifier, the gene name and the regulation given in the matrix. The last row is the name of the GO term and the number of genes representing it on the array. The name of the text file is the GO identifier.

If you change it for the better, please let me know. I suppose it should be possible to use this function in an sapply command to execute it on a list of overrepresented GOs.

I also made a function that makes tables of overrepresented GO identifiers and corresponding p-values, based on a gene x condition matrix that consists of 0 (for genes not significantly regulated) and 1 (for significantly regulated genes). Let me know if you want it.

Have fun.
Floor

    GOtable <- function(GOID) {
      # Full GO identifier and output file name,
      # e.g. "GO:0006818" and "GO_0006818.xls"
      GObase   <- paste("GO:", GOID, sep = "")
      filename <- paste("GO_", GOID, ".xls", sep = "")

      # Probes annotated to this GO term or to any of its children
      probes <- unlist(mget(GObase, envir = floorGO2ALLPROBES, ifnotfound = NA))

      # Rows of MATR whose spot identifier (column 1) belongs to the GO term;
      # the identifiers are assumed to be integers, as in the 'floor' package
      index  <- MATR[, 1] %in% as.integer(probes)
      subset <- MATR[index, ]

      # Gene names, looked up in the row order of 'subset' so the columns line up
      tmp <- mget(as.character(subset[, 1]), envir = floorGENENAME, ifnotfound = NA)
      tmp <- as.matrix(tmp)
      colnames(tmp) <- "GeneName"
      data <- cbind(tmp, subset)

      # Append the GO term as an extra last row (this assumes 'data' is a
      # data frame), then write a tab-separated text file
      bla <- get(GObase, envir = GOTERM)
      data[nrow(data) + 1, 1] <- bla
      write.table(data, file = filename, sep = "\t")
    }

_______________________________________________________
Floor Stam
Vrije Universiteit Amsterdam
Faculty of Earth and Life Sciences
Department of Molecular and Cellular Neurobiology
De Boelelaan 1085
1081HV Amsterdam
The Netherlands
Ph: +31-20-4447114, +31-20-5665512
Fax: +31-20-4447112
e-mail: fjstam@bio.vu.nl
_______________________________________________________

On 9 Oct 2004, at 5:26, Robert Gentleman wrote:
> I will add this feature to the next release.
> You will then get the GO ID and the set of LocusLink IDs that are
> mapped to that GO ID. You would then use some other tools (like
> those you suggest below) to find the related Affymetrix (or other
> manufacturer IDs).
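The idea of running a per-GO function over a whole list of overrepresented GO terms can be sketched as follows. This is only an illustration in base R with made-up data: `expr`, `go2probes` and `rows_for_go` are hypothetical names, and the helper returns a table instead of writing a file the way GOtable does.

```r
# Hypothetical toy data: expression table with the probe ID in column 1
expr <- data.frame(probe = c(1L, 2L, 3L, 4L),
                   t1    = c(0.5, -1.2, 2.0, 0.1))
# Hypothetical GO ID -> probe ID mapping
go2probes <- list("GO:0006818" = c(1L, 3L),
                  "GO:0000166" = c(2L, 3L, 4L))

# One table per GO ID: the rows of 'expr' whose probe belongs to that GO term
rows_for_go <- function(go_id) {
  expr[expr$probe %in% go2probes[[go_id]], , drop = FALSE]
}

# Apply the helper to every GO ID of interest
go_ids <- c("GO:0006818", "GO:0000166")
tables <- lapply(go_ids, rows_for_go)
names(tables) <- go_ids

# Number of probes found per GO term
counts <- sapply(tables, nrow)
```

lapply is used here rather than sapply because each call returns a table; sapply is fine for the final step, where each result is a single number.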
rgentleman ★ 5.5k
On Thu, Oct 07, 2004 at 06:25:45PM +0200, Auer Michael wrote:
> When applying GOHyperG to a series of Locus IDs, the function only reports
> the GO ID and the corresponding frequencies. But what most people want to
> have, when interpreting the GO results is the corresponding Affy IDS which
> map on the different Locus IDs and GO IDS.

Hi,

I have updated the return value for the package; I hope that comes close to addressing your concerns.

GO maps to LocusLink, so it does not make sense for this function to start with Affy IDs (or at least I don't think it does). So if you have some Affy IDs, you can use the LOCUSID environment to get the LocusLink values. If you have your Affy IDs in a vector x, you can do:

    y = unlist(mget(x, hgu95av2LOCUSID))
    names(y) = NULL
    byLL = split(x, y)   ## gets Affy IDs by LocusLink

Given a unique set of LocusLink IDs you can call GOHyperG, and the return value will now include a mapping from each GO ID to all Affy IDs mapped to it. (You can then use other tools to reduce these to unique LocusLink IDs, etc.)

So, if we presume the output of GOHyperG is in the variable xx (as it will be if you just run example(GOHyperG)), you will now see (or will see once you get the new version):

    > names(xx)
    [1] "pvalues"   "goCounts"  "chip"      "go2Affy"   "intCounts" "numLL"
    [7] "numInt"    "intLLs"

and now we can use go2Affy to find the GO to LL maps:

    myGO2LL = lapply(xx$go2Affy, function(x) unique(unlist(mget(x, hgu95av2LOCUSID))))

I think that gets pretty close to what you wanted, but let me know if it is not yet correct.

Robert
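To try the two idioms above without the hgu95av2 package installed, a toy environment can stand in for the LOCUSID environment. In this sketch `toyLOCUSID` and `go2Affy` are hypothetical objects with made-up IDs; only `mget`, `split`, `lapply` and `list2env` from base R are used.

```r
# Toy stand-in for a chip's LOCUSID environment (hypothetical probe and LocusLink IDs)
toyLOCUSID <- list2env(list("p1_at" = 101L, "p2_at" = 101L, "p3_at" = 202L))

# Probe IDs grouped by LocusLink ID
x <- c("p1_at", "p2_at", "p3_at")
y <- unlist(mget(x, envir = toyLOCUSID))
names(y) <- NULL
byLL <- split(x, y)

# Toy stand-in for the go2Affy element of the GOHyperG result
go2Affy <- list("GO:0000001" = c("p1_at", "p2_at"),
                "GO:0000002" = c("p2_at", "p3_at"))

# GO ID -> unique LocusLink IDs, via the probes mapped to each GO ID
myGO2LL <- lapply(go2Affy, function(p) {
  unique(unlist(mget(p, envir = toyLOCUSID), use.names = FALSE))
})
```

Two probes that share a LocusLink ID end up in the same element of `byLL`, and contribute that ID only once per GO term in `myGO2LL`.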