question about ontoCompare() performance change
Scott Markel ▴ 50
@scott-markel-2964
Last seen 9.6 years ago
Canada
Just a quick FYI to anyone else using goTools' ontoCompare().

It looks like it's approximately another factor of 2 slower in
BioConductor 2.5. User time has gone from 25 seconds (2.3) to
150 seconds (2.4) to 290 seconds (2.5). Don't know if this is
package-specific or caused by changes in R.

Scott


-----Original Message-----
From: Scott Markel
Sent: Wednesday, 10 June 2009 5:15 PM
To: Bioconductor at stat.math.ethz.ch
Subject: question about ontoCompare() performance change

I'm seeing a noticeable performance change in goTools' ontoCompare()
from BioConductor version 2.3 to 2.4. With the same input data the
user time reported by system.time() on my Windows XP machine has gone
from 25 seconds to about 150 seconds. Times on a RHEL 5 machine are
30 seconds and 130 seconds.

I checked the ontoCompare() help, the goTools documentation, the mailing
list archives, and Google for terms like "ontoCompare goTools performance",
and didn't find anything.

I'm sure I'm missing something obvious, but I'd appreciate advice on
how I should now be using ontoCompare() in Bioc 2.4.

The script, BioC 2.3 output, BioC 2.4 output, and two sets of
sessionInfo() follow.

Scott

##############################
Here's the R script, using the same inputs for both BioC 2.3 and 2.4.

prop <- list()
prop$probeIDs <- c("1007_s_at", "1053_at", "117_at", "121_at",
    "1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at", "1405_i_at")
prop$microarrayType <- "hgu133a"

library("goTools")
library("hgu133a.db")

system.time(result <- ontoCompare(list(prop$probeIDs),
    probeType=as.character(prop$microarrayType), method="none", goType="MF"))
##############################
The BioC 2.3 output is

   user  system elapsed
  23.31    0.22   25.70

> result
  binding catalytic activity chemoattractant activity enzyme regulator activity
1      10                  4                        2                         1
  molecular transducer activity structural molecule activity
1                             5                            1
  transcription regulator activity NotFound
1                                2        0
##############################
The BioC 2.4 output is

   user  system elapsed
 151.16    0.41  169.11

> result
                                 [,1]
catalytic activity                  4
binding                            10
enzyme regulator activity           1
transcription regulator activity    2
chemoattractant activity            2
molecular transducer activity       5

##############################
> sessionInfo()
R version 2.7.2 (2008-08-25)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] hgu133a_2.2.0       hgu133a.db_2.2.0    goTools_1.12.0
 [4] GO_2.2.0            annotate_1.18.0     xtable_1.5-4
 [7] AnnotationDbi_1.2.2 RSQLite_0.7-0       DBI_0.2-4
[10] Biobase_2.0.1
##############################
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hgu133a.db_2.2.11   goTools_1.18.0      GO.db_2.2.11
[4] RSQLite_0.7-1       DBI_0.2-4           AnnotationDbi_1.6.0
[7] Biobase_2.4.1
##############################

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys (SciTegic R&D)             mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com

http://www.linkedin.com/in/smarkel
Vice President, Board of Directors:
    International Society for Computational Biology
Co-chair: ISCB Publications Committee
Associate Editor: PLoS Computational Biology
Editorial Board: Briefings in Bioinformatics
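For readers who want the slowdown as ratios rather than raw seconds, the short sketch below does the arithmetic on the three user times quoted in the message for the Windows XP machine. The variable names (`user_time`, `step_factor`, `total_factor`) are made up for illustration, and the inputs are the rounded figures from the post, not fresh measurements.

```r
# User times in seconds as quoted in the post (Windows XP machine),
# keyed by Bioconductor release. These are the rounded values from the text.
user_time <- c("2.3" = 25, "2.4" = 150, "2.5" = 290)

# Slowdown of each release relative to the one before it.
step_factor <- user_time[-1] / user_time[-length(user_time)]

# Cumulative slowdown relative to the 2.3 baseline.
total_factor <- user_time / user_time[["2.3"]]

print(round(step_factor, 2))
print(round(total_factor, 2))
```

On these figures the 2.3 to 2.4 step is a factor of 6 and the 2.4 to 2.5 step is a little under 2, which matches the "approximately another factor of 2" in the message.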
Transcription GO hgu133a goTools
Seth Falcon ★ 7.4k
@seth-falcon-992
Last seen 9.6 years ago
Hi Scott,

Thanks for the reminder and providing a reproducible example. We will
take a look and see if we can understand and provide a fix for the
slowdown.

+ seth

On 10/28/09 5:23 PM, Scott Markel wrote:
> Just a quick FYI to anyone else using goTools' ontoCompare().
> [...]
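One way to start on the open question in this thread (is the slowdown package-specific or caused by changes in R?) is to profile the call with base R's Rprof() and see which functions account for the time. The sketch below is only an illustration, not what the maintainers did. `profile_call()` and `slow_stand_in()` are hypothetical helpers, and the stand-in workload is there so the example runs without goTools installed. In practice the ontoCompare() call from the script would be passed in its place.

```r
# Hypothetical helper: profile one expression and return the per-function
# summary sorted by total time. Uses only Rprof()/summaryRprof() from utils.
profile_call <- function(expr, interval = 0.01) {
  prof_file <- tempfile(fileext = ".out")
  on.exit(unlink(prof_file), add = TRUE)
  Rprof(prof_file, interval = interval)
  # 'expr' is a promise; forcing it here runs it while profiling is active.
  # Profiling is switched off again even if the expression fails.
  tryCatch(force(expr), finally = Rprof(NULL))
  summaryRprof(prof_file)$by.total
}

# Stand-in workload (an assumption, not goTools code): repeated sorting
# that takes long enough for the sampling profiler to record something.
slow_stand_in <- function(n = 200) {
  for (i in seq_len(n)) sort(runif(1e5))
  invisible(NULL)
}

hot_spots <- profile_call(slow_stand_in())
print(head(hot_spots, 10))
```

Running the same profile under each Bioconductor release and comparing the top rows would show whether the extra time sits in goTools' own functions or in the annotation and database layers underneath it.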