Question: goTools: ontoCompare question
Hello,

I ran ontoCompare on the full list of probes in the mouse4302 genechip, both with the default EndNodeList() and with a custom end node list containing only the antioxidant activity, biological_process, cellular_component, and molecular_function GO terms, and found what appears to be a discrepancy:

> length(sviData$svi$ID)
[1] 45101
> sviData$svi$ID[1:5]
[1] "1452670_at" "1422340_a_at" "1452114_s_at" "1422644_at" "1423359_at"
> listall<-list("allprobes"=sviData$svi$ID)
> endlist<-c("GO:0003674", "GO:0005575", "GO:0008150", "GO:0016209")
> totalAnnotations<-ontoCompare(listall, probeType="mouse4302", method="none")
> write.table(totalAnnotations, file="totalAnnotations.txt")
> totalAnnotations2<-ontoCompare(listall, probeType="mouse4302", method="none", endnode=endlist)
> write.table(totalAnnotations2, file="totalAnnotations_reduced.txt")

When finding the total possible number of annotations for the top-level GO terms (BP, MF, CC), I got different numbers for the two approaches, but I got the same numbers for "NotFound" and "antioxidant activity":

from totalAnnotations.txt
antioxidant activity     127
biological_process      2594
cellular_component      2365
molecular_function      2414
NotFound               11120
...others

from totalAnnotations_reduced.txt
antioxidant activity     127
biological_process     28020
cellular_component     28509
molecular_function     30875
NotFound               11120

I was just wondering if anyone knew why this might happen, since it affects the interpretation of a comparison I was going to do. These data appear to reflect the histogram output from ontoPlot (so I don't think it's an R->txt->excel thing). Is the output with method="none" the total number of times all probes are annotated at the end node or at a child of the end node? Does it have something to do with the "isa" values in EndNodeList() or my method of creating endlist?

R v.2.5.0
goTools v1.8.0

Cheers,
Dave

--and thank you Dick for recommending topGO. I found what I needed through that package.
modified 12.5 years ago by Paquet, Agnes • written 12.5 years ago by davidl@unr.nevada.edu
12.5 years ago, Paquet, Agnes wrote:
Hi Dave,

The current algorithm in ontoCompare is the following:

- for each probe id in your list, retrieve all GO ids corresponding to this probe id
- then, map these GO ids up to the end nodes provided as argument to the function (or the default ones)
- once the mapping is finished, add 1 to the count of each end node which was reached at least once (and not the number of times a node was hit, which explains the discrepancy in your example)

For example, if I use only 1 Affy probe, and restrict everything to MF to simplify your example, ontoCompare will give me the following results:

1) Using the default end nodes:

> ontoCompare(list("1415670_at"),probeType="mouse4302",goType="MF",method="none")
[1] "Starting ontoCompare..."
[1] "Number of lists = 1"
[1] "Using method: none"
binding  structural molecule activity  transporter activity  NotFound
      1                             1                     1         0

(we have 1 count for each end node which was reached at least once)

2) Using your endlist:

> ontoCompare(list("1415670_at"),probeType="mouse4302",goType="MF",method="none",endnode=endlist)
[1] "Starting ontoCompare..."
[1] "Number of lists = 1"
[1] "Using method: none"
molecular_function  NotFound
                 1         0

(same here, only 1 count for MF, and not 3)

We made this choice because some nodes/probes may be more annotated than others, and it could make the relative comparison of 2 lists of probes appear more different based on the availability of annotations, and not true biological difference. You could also use the other methods to get the number of hits relative to the number of probes or the number of GO ids in your list.

I hope this will help; don't hesitate to email me again if you have more questions.
Best,
Agnes

In reply to: davidl@unr.nevada.edu, "[BioC] goTools: ontoCompare question", sent Fri 6/29/2007 8:01 AM, to Bioconductor
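
The counting rule described in the reply (add 1 per end node reached at least once per probe, rather than 1 per annotation that maps there) can be illustrated with a small base R sketch. This is not the goTools source code: the GO ids, the probe names, and the helpers `ancestors()` and `count_end_nodes()` are all hypothetical, and the toy end nodes are not nested, so the sketch says nothing about how goTools treats an end node that sits below another end node.

```r
# Toy GO graph, child -> parents (hypothetical ids, not real GO terms).
parents <- list(
  "GO:root"   = character(0),
  "GO:bind"   = "GO:root",
  "GO:struct" = "GO:root",
  "GO:transp" = "GO:root",
  "GO:leaf1"  = "GO:bind",
  "GO:leaf2"  = "GO:struct",
  "GO:leaf3"  = "GO:transp"
)

# Toy probe annotations (hypothetical); probeC has no annotation at all.
probe2go <- list(
  probeA = c("GO:leaf1", "GO:leaf2", "GO:leaf3"),
  probeB = "GO:leaf1",
  probeC = character(0)
)

# All ancestors of a term, including the term itself.
ancestors <- function(term) {
  seen <- character(0)
  todo <- term
  while (length(todo) > 0) {
    cur <- todo[1]
    todo <- todo[-1]
    if (!(cur %in% seen)) {
      seen <- c(seen, cur)
      todo <- c(todo, parents[[cur]])
    }
  }
  seen
}

# Count end nodes over a list of probes.
# per_hit = FALSE: each probe adds at most 1 to each end node it reaches.
# per_hit = TRUE:  each annotation that maps to an end node adds 1.
count_end_nodes <- function(probes, endnodes, per_hit = FALSE) {
  counts <- setNames(integer(length(endnodes) + 1), c(endnodes, "NotFound"))
  for (p in probes) {
    hits <- unlist(lapply(probe2go[[p]], function(g) {
      intersect(ancestors(g), endnodes)
    }))
    if (length(hits) == 0) {
      counts["NotFound"] <- counts["NotFound"] + 1L
      next
    }
    if (!per_hit) hits <- unique(hits)
    for (h in hits) counts[h] <- counts[h] + 1L
  }
  counts
}

probes   <- names(probe2go)
detailed <- c("GO:bind", "GO:struct", "GO:transp")
coarse   <- "GO:root"

count_end_nodes(probes, detailed)               # bind 2, struct 1, transp 1, NotFound 1
count_end_nodes(probes, coarse)                 # root 2, NotFound 1
count_end_nodes(probes, coarse, per_hit = TRUE) # root 4, NotFound 1
```

In this toy data, NotFound is the same for both end node lists, while the count for the single coarse node (2) is not the sum of the counts for the detailed nodes (4). Counts obtained with different end node lists are therefore not directly comparable, which is the point the reply makes.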