mouse4302.db vs mouse 4302
@richard-friedman-513
Last seen 9.9 years ago
Dear bioconductor list,

The mouse4302.db annotation package is listed as Version 2.02 of the annotation for mouse 4302 and was packaged Mon Oct 22 10:15:57.

The mouse4302 annotation package is listed as Version 2.01 and was packaged Thu Oct 11 11:31:47 2007.

Should I therefore use mouse4302.db as the more recent and authoritative of the 2 packages? Or is it simply one with different contents?

I realize that the answer to this question may seem pretty obvious, but I am wondering why mouse4302.db has a different name and not simply the same name and a different version number. So I thought I would ask to make sure.

Thanks and best wishes,
Rich

-----------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist, Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer, Department of Biomedical Informatics (DBMI)
Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/
National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Box 95, Room 130BB or P&S 1-420C
Columbia University Medical Center
630 W. 168th St.
New York, NY 10032
(212)305-6901 (5-6901) (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

In Memoriam, Arthur C. Clarke
Annotation Cancer mouse4302 • 1.1k views
@herve-pages-1542
Last seen 2 days ago
Seattle, WA, United States
Hi Richard,

Richard Friedman wrote:
> The mouse4302.db annotation package is listed as Version 2.02 of the annotation for mouse 4302 and was packaged Mon Oct 22 10:15:57.
>
> The mouse4302 annotation package is listed as Version 2.01 and was packaged Thu Oct 11 11:31:47 2007.

The 2 packages contain the same data but in a different format. This difference is transparent to the end-user, so you could use either one or the other, but we recommend you always use the .db version whenever it's available (the .db packages are SQLite-based, which is the format we are moving all our annotations to, while the other packages are environment-based, which is the classic and soon to be deprecated format).

> Should I therefore use mouse4302.db as the more recent and authoritative of the 2 packages?

Yes, please use mouse4302.db. The other one (mouse4302) is provided for backward compatibility only and will be dropped in future versions of Bioconductor.

Cheers,
H.
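To make the "transparent to the end-user" point concrete, here is a minimal sketch of a lookup with the SQLite-based package. It assumes mouse4302.db is installed from Bioconductor; the two probe IDs are only illustrative and were not part of the original question. The environment-based mouse4302 package was queried with the same kind of mget() call, which is why switching should need little more than changing the library() line.

```r
# Assumes the Bioconductor annotation package mouse4302.db is installed.
library(mouse4302.db)

# Illustrative probe set IDs (an assumption, not taken from the thread).
probes <- c("1415670_at", "1415671_at")

# Look up gene symbols; ifnotfound = NA avoids an error for unknown IDs.
symbols <- mget(probes, mouse4302SYMBOL, ifnotfound = NA)
symbols
```

The result is a named list with one element per probe ID, in the order the IDs were given.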
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.9 years ago
United States
Hi Richard,

In general you want to use the .db version of these packages as it is the newer format.

The older format is really only provided for people who have lots of old-style package dependencies that they still need to work out. In a couple of weeks, we should have a new version of these packages for you to explore. But the older format will be deprecated, which means we are not planning to make it again after the upcoming release...

Marc
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080417/5c6f0ac5/attachment.pl
Hi Gabor,

Gábor Csárdi wrote:
> Dear users & developers,
>
> I'm sure that there were discussions about the pros and cons of switching to .db packages before the actual decision was made. Personally I don't really understand why it is not possible to have the old style packages. If anyone can point out the pros of having the .db packages, or just forward me to the relevant arguments (assuming they are public), I would be really grateful.

It's not possible to have the old style packages because the SQLite packages are better, and it is entirely too much work to maintain duplicate annotation packages plus the code base to make both sets.

> For me, environment-based packages are much better, because they're faster, and I have enough RAM to load everything at the same time. Here is a little demonstration:
>
> > library(org.Hs.eg.db)
> > egGO <- new.env(hash=TRUE)
> > all <- mget( ls(org.Hs.egGO), org.Hs.egGO )
> > for (i in seq(all)) {assign(names(all)[i], all[[i]], envir=egGO) }
> > sam <- sample(ls(org.Hs.egGO), 5000)
> > system.time( tmp <- mget(sam, org.Hs.egGO) )
>    user  system elapsed
>   0.808   0.001   0.811
> > system.time( tmp <- mget(sam, egGO) )
>    user  system elapsed
>   0.022   0.000   0.023
>
> You could argue that the slower query is still under one second, and it does not matter, but it actually does matter if you have lots of queries. For some of my scripts the difference is something like 1 day against 10 minutes. E.g. GO enrichment calculation can be sped up about 100 times if one uses the environment-based packages and tailors the original hyperGTest for GO a bit.

In this example the environment-based packages are faster, but they are very limiting. For instance, they are only good for one-to-one or one-to-many relationships. If you want the reverse mapping of, say, a GOTERM to Entrez Gene IDs, this is difficult to do programmatically (whereas this is simple with the SQLite packages, using revmap()). In addition, the env-based packages don't allow you to do queries that use two envs at once (e.g., you can't get GO terms for a set of Affy IDs without doing sequential queries, whereas the SQLite packages can do this in a single query).

Additionally, just because you happen to have the compute power and RAM to use a large environment doesn't mean that the average BioC user does.

If you want the SQLite-based packages to be close to the same speed as the environment packages, you probably won't be able to use the convenience functions that were written to allow naive users to get similar results with the SQLite packages. Things like get(), mget(), etc. will probably always be slower than a directed SQL query.

    > library(GO)
    > sam <- sample(ls(GOTERM), 5000)
    > system.time(mget(sam, GOTERM))
       user  system elapsed
       0.02    0.00    0.02
    > library(GO.db)
    > sam <- sample(ls(GOTERM), 5000)
    > system.time( mget(sam, GOTERM))
       user  system elapsed
       5.25    0.02    5.27
    > samvec <- paste(paste("'", sam, "'", sep = ""), collapse = ",")
    > sql <- paste("SELECT term FROM go_term WHERE go_id IN (",samvec,");")
    > system.time(dbGetQuery(GO_dbconn(), sql))
       user  system elapsed
       0.09    0.01    0.11

See the vignette in the devel version of AnnotationDbi for more information about these packages.

Best,

Jim

> It is interesting to see that in R, the direction was "storing everything in memory" instead of keeping the data on the disk, as many other statistical software packages do; here, however, the direction is the opposite. You can also argue that these data packages are getting bigger and bigger each year, but so does the size of the RAM in the computers. For me it is not obvious which curve is steeper.
>
> Finally, it might happen that the .db packages are just as fast as the old ones, only I'm not using them the right way. If this is the case, then I apologize, please show me the right way.
>
> Best Regards,
> Gabor

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
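The points above about directed SQL queries and about combining two mappings in one step can be sketched on a toy database. This is an illustration under stated assumptions, not the schema of GO.db or mouse4302.db: the tables probe2gene and gene2go, their columns, and all IDs are made up, and the sketch assumes the DBI and RSQLite packages are installed.

```r
library(DBI)

# Toy in-memory SQLite database. Table names, column names and IDs are
# hypothetical; they do not mirror any real annotation package schema.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "probe2gene", data.frame(
  probe_id = c("p1", "p2", "p3"),
  gene_id  = c("g1", "g2", "g2"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "gene2go", data.frame(
  gene_id = c("g1", "g1", "g2"),
  go_id   = c("GO:A", "GO:B", "GO:B"),
  stringsAsFactors = FALSE
))

# Build a quoted IN (...) list for the probes of interest.
probes  <- c("p1", "p3")
in_list <- paste(dbQuoteString(con, probes), collapse = ", ")

# One directed query that goes probe -> gene -> GO term via a join,
# instead of two sequential per-key lookups.
sql <- paste0(
  "SELECT p.probe_id, g.go_id ",
  "FROM probe2gene AS p JOIN gene2go AS g ON p.gene_id = g.gene_id ",
  "WHERE p.probe_id IN (", in_list, ") ",
  "ORDER BY p.probe_id, g.go_id"
)
probe_go <- dbGetQuery(con, sql)

# The reverse mapping (GO term -> genes) is just another query
# against the same table.
genes_for_term <- dbGetQuery(
  con,
  "SELECT gene_id FROM gene2go WHERE go_id = 'GO:B' ORDER BY gene_id"
)$gene_id

dbDisconnect(con)
```

With a real annotation package the connection would come from a function such as GO_dbconn(), as in the timing example above, and the table names would be those of the package's own schema.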