BioConductor Deployment at a large site, suggestions?
@brian-k-smith-1030
Last seen 9.6 years ago
I may be missing something simple here; if I am, please tell me.

I'd like to request an easy way to set up a local repository of BioConductor for a large site. What I would like is a simple tar archive of all the packages and dependencies I need for a default install of BioConductor, and a list of the installation order.

For an individual, the getBioC function is very easy, and great. However, as a site administrator I would prefer not to use it because:

1) Downloading the same stuff for 100 machines is a waste of bandwidth.
2) I want all machines to have the same version until I can upgrade all of them. Running getBioC on a machine I install a month from now may get different versions, and/or require a different version of R.
3) I could just run getBioC on all machines once a week to ensure uniformity, but this is a waste of bandwidth, and newer versions of BioConductor may require me to upgrade R before I am certain that the other packages I have will work on the new R.

Previously, in June, I was able to download all the "default" packages and install them with R CMD INSTALL, but this was difficult because "default" is defined differently by:

1) the web page
2) the getBioC script's comments
3) the getBioC script's output

Eventually, I used #3 to determine what I needed and where to get it. I then made a script to install the packages from an NFS directory. Doing this was difficult because certain dependencies, such as GO and KEGG, are not listed on the web for download; they do not seem to be on the BioConductor web pages or on CRAN. Finding GO/KEGG was made more difficult because the web directory where getBioC gets the packages (http://www.bioconductor.org/data/metaData/) doesn't allow me to browse it. So to get the tar.gz files I had to guess at the names from the output of getBioC.

Somewhere I did see that getBioC can be set to use a local web repository, but for the reasons above this does not seem trivial to set up.

I did search the mailing list, and the closest match was someone with a makefile that used wget to get _everything_, not just a default install.

Thus, can I request:

* A link on the web page, maybe 'For Administrators of many machines', to something saying which packages and dependencies are needed for the default install.
* Either one large download with the packages, or at least a page or ftp site listing all the packages available and the dependencies needed, such as GO and KEGG. Once I have the default install, adding additional packages is simple.

This would make life easier for those of us with many machines that need to have the same software, and who may need the exact same software reinstalled in the future. I'm attempting to upgrade to R 2.0, which should be relatively simple, but BioConductor is adding many frustrating hours of work that I shouldn't have to do every time I want to refresh the package.

Please tell me I'm missing some simple way to do a large-scale deployment of this package!

Thanks,
Brian
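The "list of the installation order" requested above can, in principle, be derived from the packages' declared dependencies. The following is a minimal sketch in Python, not anything getBioC or reposTools provides: the function name `install_order` is hypothetical, and the dependency edges in `example_depends` are invented purely for illustration. Real edges would have to come from each package's DESCRIPTION file.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def install_order(depends):
    """Return package names ordered so each one comes after its dependencies.

    `depends` maps a package name to the list of packages it needs.
    graphlib raises CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter()
    for pkg in sorted(depends):
        # add(node, *predecessors): the predecessors must be installed first
        sorter.add(pkg, *sorted(depends[pkg]))
    return list(sorter.static_order())


# Hypothetical dependency map: the edges below are made up for illustration
# and do not describe the real packages.
example_depends = {
    "affy": ["Biobase", "reposTools"],
    "annotate": ["Biobase", "GO", "KEGG"],
    "Biobase": [],
    "GO": ["reposTools"],
    "KEGG": ["reposTools"],
    "reposTools": [],
}

if __name__ == "__main__":
    for name in install_order(example_depends):
        print(name)
```

A site could keep the resulting list under version control next to the tarballs, so that a reinstall months later uses the same packages in the same order.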
GO GO • 879 views
Jeff Gentry ★ 3.9k
@jeff-gentry-12
Last seen 9.6 years ago
> I may be missing something simple here, if I am please tell me.

Your question would most likely be answered by one of the sections in the FAQ, "Downloading All Packages From A Repository".
@brian-k-smith-1030
Last seen 9.6 years ago
> Your answer most likely would be solved by one of the sections in the FAQ,
> "Downloading All Packages From A Repository".

Unfortunately, I had looked at the FAQ, and it didn't help much, probably due to my unfamiliarity with the package. The list gets me the directories for reposTools, most of which return "permission denied" when I try to browse them and download the .tar.gz files directly. It also doesn't tell me all the packages and dependencies needed for the "default" install of affy, cdna, and exprs.

Thus every time I upgrade the package it's far more work than any other R package I've ever encountered, as I have to figure out everything I need to install and in what order. Between that and not wanting to hit bioconductor.org 100 times for 100 machines, I upgrade BioConductor rarely.

Perhaps if I were an R guru this would be clear, but I'm a sys admin, so I'm somewhere between the simple single user who just wants to run getBioC() and be done, and someone who wants their own repository of all the possible packages. I know just enough to install and do basic testing of R and its packages. I would love to learn more R, but due to workload I just don't have time.

I would like an ordered list of packages and dependencies for the default install into a typical R install. I could then script this so that when I need to kickstart a machine the entire process gets done in 30 minutes and I have a machine that looks the same as it did before.

Though the smaller ordered R CMD INSTALL list would be ideal, I think reposTools and getBioC combined with an R BATCH may work. I'll look into it and probably contact you or Robert off list. I can't imagine that I'm the only sys admin with this issue. Perhaps when done I can write a HOWTO and post it back to the list.

Thanks,
Brian
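The scripted install from an ordered list described above might look roughly like the sketch below. It assumes the tarballs already sit in a shared directory and that the list is already in dependency order. The directory `/nfs/bioc`, the tarball file names, and the helper names `build_commands` and `run_commands` are all hypothetical; only `R CMD INSTALL <tarball>` itself comes from the thread.

```python
import shlex
import subprocess
from pathlib import PurePosixPath


def build_commands(ordered_tarballs, repo_dir):
    """Build one `R CMD INSTALL` command per tarball, preserving list order."""
    repo = PurePosixPath(repo_dir)
    return [["R", "CMD", "INSTALL", str(repo / name)] for name in ordered_tarballs]


def run_commands(commands, dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError if R exits with an error
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names and versions, listed dependencies first.
    tarballs = ["reposTools_1.0.tar.gz", "Biobase_1.0.tar.gz", "affy_1.0.tar.gz"]
    run_commands(build_commands(tarballs, "/nfs/bioc"), dry_run=True)
```

Running it with `dry_run=True` first shows exactly what would be installed, which is convenient to check before wiring the script into a kickstart post-install step.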