Installing Bioconductor on Linux...

James Carman ▴ 150

@james-carman-4265

I've got Bioconductor successfully installed on my Linux workstation at work, but we need to install it on our server. I remember it being particularly difficult to make sure I had all the right packages installed in the O/S. Is there an easier way?

ADD COMMENT • link updated 14.0 years ago by Martin Morgan 25k • written 14.0 years ago by James Carman ▴ 150

Martin Morgan 25k

@martin-morgan-1513

Hi James --

On 10/03/2010 06:36 PM, James Carman wrote:
> I've got Bioconductor successfully installed on my linux workstation
> at work, but we need to install it on our server. I remember it being
> particularly difficult to make sure I had all the right packages
> installed in the O/S. Is there an easier way?

Others will have different suggestions, but my two cents...

(1) Probably the way to proceed is to install R and biocLite() packages as 'root' or similarly privileged account. Base and commonly used packages will be stored in a system-wide location. Individual users requiring additional packages will say biocLite("OtherPackage") and these will be installed in their own user directory as described in ?install.packages (which is what biocLite uses to install packages):

    If 'lib' is omitted or is of length one and is not a (group) writable directory, the code offers to create a personal library tree (the first element of 'Sys.getenv("R_LIBS_USER")') and install there.

(2) 'Third party' dependencies need to be satisfied in a rational way, remembering (a) that R packages have complex dependencies with other R packages, and (b) R compiles C (and other) source code and so requires header files associated with third party libraries. Combining these, one can imagine biocLite("biomaRt") failing because XML fails because RCurl fails because the *devel* libcurl (devel required for the curl headers) is not installed. This will be indicated in the output of biocLite, but will require patient inspection of the output to see this.

Part of the third party installation process may mean evaluating standard Linux commands (e.g., /sbin/ldconfig), setting environment variables (LD_LIBRARY_PATH ?), and under worst-case scenarios (Rmpi comes to mind) inspecting the R package configure.in / configure.ac script (by downloading the source package from CRAN or Bioconductor) to understand what the requirements are and how they are supposed to be satisfied.

A final comment is that the next version of R is about to be released (scheduled October 15 for R, Oct 18 for Bioconductor), so if you're only going to get one opportunity to sit down with your system administrator you might want to delay for a couple of weeks. On the other hand it's a learning experience and much easier the second time.

The Bioconductor team is interested in developing, over the next year or so, a more fool-proof way of distributing Bioconductor, so I encourage others to contribute their solutions and experiences.

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center

ADD COMMENT • link 14.0 years ago Martin Morgan 25k
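
Editorial sketch for point (2) in the answer above: before running biocLite(), it can help to check that the build helpers the source packages look for are on the PATH. The function name missing_tools and the tool list are assumptions for illustration, not something from the thread. As far as I know, the configure scripts of RCurl and XML look for curl-config and xml2-config, which come with the libcurl and libxml2 development packages; the exact package names vary by distribution and release.

```shell
#!/bin/sh
# Print each requested command that cannot be found on PATH, one per line.
# Prints nothing when every command is available.
# missing_tools is a hypothetical helper name, not part of R or Bioconductor.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call (the tool list is an assumption; adjust it to the packages
# you plan to install):
#   missing_tools gcc make curl-config xml2-config
```

An empty result only means the helpers were found. It does not guarantee that every header a given package needs is installed.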

Replies in-line:

On Sun, Oct 3, 2010 at 10:29 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
> (1) Probably the way to proceed is to install R and biocLite() packages
> as 'root' or similarly privileged account. Base and commonly used
> packages will be stored in a system-wide location. Individual users
> requiring additional packages will say biocLite("OtherPackage") and
> these will be installed in their own user directory as described in
> ?install.packages (which is what biocLite uses to install packages):
>
>     If 'lib' is omitted or is of length
>     one and is not a (group) writable directory, the code offers to
>     create a personal library tree (the first element of
>     'Sys.getenv("R_LIBS_USER")') and install there.
>

R is quite easy to install on my Red Hat Linux machine (I run Fedora, but our servers are RHEL4). I just did "yum install R", I believe.

> (2) 'Third party' dependencies need to be satisfied in a rational way,
> remembering (a) that R packages have complex dependencies with other R
> packages, and (b) R compiles C (and other) source code and so requires
> header files associated with third party libraries. Combining these, one
> can imagine biocLite("biomaRt") failing because XML fails because RCurl
> fails because the *devel* libcurl (devel required for the curl headers)
> is not installed. This will be indicated in the output of biocLite, but
> will require patient inspection of the output to see this. Part of the
> third party installation process may mean evaluating standard Linux
> commands (e.g., /sbin/ldconfig), setting environment variables
> (LD_LIBRARY_PATH ?), and under worst-case scenarios (Rmpi comes to mind)
> inspecting the R package configure.in / configure.ac script (by
> downloading the source package from CRAN or Bioconductor) to understand
> what the requirements are and how they are supposed to be satisfied.
>

This is my main concern: the libraries that Bioconductor needs to have installed in the operating system. So, there's no one-shot method of getting this done, eh? I basically need to see what it complains about and then figure out how to satisfy the dependency?

> A final comment is that the next version of R is about to be released
> (scheduled October 15 for R, Oct 18 for Bioconductor), so if you're only
> going to get one opportunity to sit down with your system administrator
> you might want to delay for a couple of weeks. On the other hand it's a
> learning experience and much easier the second time.
>

Very cool! Updating R isn't really that tough. It should happen with a "yum update". Hopefully the Bioconductor dependencies won't change much, so that the upgrade path for it is easy too.

> The Bioconductor team is interested in developing, over the next year or
> so, a more fool-proof way of distributing Bioconductor, so I encourage
> others to contribute their solutions and experiences.
>

This would be great for us Linux geeks! :) It installed just fine on my windoze box at home. No worries there.

Thank you for your insights! I'll keep plugging away at it.

ADD REPLY • link 14.0 years ago James Carman ▴ 150
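
Editorial sketch on the question of seeing what the installation complains about: one option is to capture the biocLite() output in a file and pull out the error lines. This is a minimal sketch, assuming the output was saved to a log file; the function name install_errors and the file names are hypothetical. The patterns target the "ERROR:" lines and the "non-zero exit status" warning that R prints when a source package fails to install; other failures may need other patterns.

```shell
#!/bin/sh
# Print the lines of an R installation log that mark a failed package,
# prefixed with their line numbers in the log.
# install_errors is a hypothetical helper name, not part of R or Bioconductor.
install_errors() {
    grep -n -E 'ERROR: |non-zero exit status' "$1"
}

# Hypothetical usage (install.R is an assumed script that calls biocLite()):
#   Rscript install.R > biocLite.log 2>&1
#   install_errors biocLite.log
```

The earliest match is usually the one to fix first, since packages further up the dependency chain tend to fail as a consequence of it.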

Hello,

On 10/04/2010 01:59 PM, James Carman wrote:
>
>> (2) 'Third party' dependencies need to be satisfied in a rational way,
>> remembering (a) that R packages have complex dependencies with other R
>> packages, and (b) R compiles C (and other) source code and so requires
>> header files associated with third party libraries. Combining these, one
>> can imagine biocLite("biomaRt") failing because XML fails because RCurl
>> fails because the *devel* libcurl (devel required for the curl headers)
>> is not installed. This will be indicated in the output of biocLite, but
>> will require patient inspection of the output to see this. Part of the
>> third party installation process may mean evaluating standard Linux
>> commands (e.g., /sbin/ldconfig), setting environment variables
>> (LD_LIBRARY_PATH ?), and under worst-case scenarios (Rmpi comes to mind)
>> inspecting the R package configure.in / configure.ac script (by
>> downloading the source package from CRAN or Bioconductor) to understand
>> what the requirements are and how they are supposed to be satisfied.
>>
>>
> This is my main concern, the libraries that Bioconductor needs to be
> installed in the operating system. So, there's no one-shot method of
> getting this done, eh? I basically need to see what it complains
> about and then figure out how to satisfy the dependency?
>

The Debian/Ubuntu community is very aware of Bioconductor, and there have been several efforts, both manual and automated, to provide Debian/Ubuntu packages for it ... including all its C and other external libraries. The problem is that nobody has yet come up with the necessary amount of spare time to maintain the Bioconductor packages. And we usually want to use the latest versions, which are then not too difficult to install manually.

The advantage of Debian is the very clear package management, i.e. you can get things out of the system again and have some confidence that the installation on another system is truly the same.

There is probably a consensus that the package should not become part of the regular Debian/Ubuntu distribution. What one could try instead, and this comes to mind as I write these lines, is a Debian/Ubuntu package that pulls in the binary dependencies that are already in Debian/Ubuntu and then executes the getBioC script in some suitable manner as a post-inst script. And one could have several such packages for the various arguments that getBioC could take.

Would you like that? Would I like that? It sounds somewhat error-prone and very much against how R packages appear in Debian/Ubuntu today, and one basically loses the advantage of being able to get it all out again.

Other ideas?

Cheers,

Steffen

ADD REPLY • link 14.0 years ago Steffen Moeller ▴ 90

On Sun, Oct 3, 2010 at 10:29 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> Hi James --
>
> On 10/03/2010 06:36 PM, James Carman wrote:
>> I've got Bioconductor successfully installed on my linux workstation
>> at work, but we need to install it on our server. I remember it being
>> particularly difficult to make sure I had all the right packages
>> installed in the O/S. Is there an easier way?
>
> Others will have different suggestions, but my two cents...
>
> (1) Probably the way to proceed is to install R and biocLite() packages
> as 'root' or similarly privileged account. Base and commonly used
> packages will be stored in a system-wide location. Individual users
> requiring additional packages will say biocLite("OtherPackage") and
> these will be installed in their own user directory as described in
> ?install.packages (which is what biocLite uses to install packages):
>
>     If 'lib' is omitted or is of length
>     one and is not a (group) writable directory, the code offers to
>     create a personal library tree (the first element of
>     'Sys.getenv("R_LIBS_USER")') and install there.
>
> (2) 'Third party' dependencies need to be satisfied in a rational way,
> remembering (a) that R packages have complex dependencies with other R
> packages, and (b) R compiles C (and other) source code and so requires
> header files associated with third party libraries. Combining these, one
> can imagine biocLite("biomaRt") failing because XML fails because RCurl
> fails because the *devel* libcurl (devel required for the curl headers)
> is not installed. This will be indicated in the output of biocLite, but
> will require patient inspection of the output to see this. Part of the
> third party installation process may mean evaluating standard Linux
> commands (e.g., /sbin/ldconfig), setting environment variables
> (LD_LIBRARY_PATH ?), and under worst-case scenarios (Rmpi comes to mind)
> inspecting the R package configure.in / configure.ac script (by
> downloading the source package from CRAN or Bioconductor) to understand
> what the requirements are and how they are supposed to be satisfied.
>
> A final comment is that the next version of R is about to be released
> (scheduled October 15 for R, Oct 18 for Bioconductor), so if you're only
> going to get one opportunity to sit down with your system administrator
> you might want to delay for a couple of weeks. On the other hand it's a
> learning experience and much easier the second time.
>
> The Bioconductor team is interested in developing, over the next year or
> so, a more fool-proof way of distributing Bioconductor, so I encourage
> others to contribute their solutions and experiences.

I have a set of scripts for doing daily updates/compilation of R-devel that runs in a multi-user, multi-node setting here at Hopkins, and it seems to work great now that I have ironed out the various edge cases over the last few months. I probably spend less than 5 minutes per week on this (unlike in the beginning, when I spent quite a bit more time). Of course, we have specific needs and use cases that may or may not be relevant for others.

I have it on my todo list to make a document/webpage detailing what I have done. It will most likely be most useful for people who want daily updates to R-devel with minimal maintenance. It does not address the problem of pulling in external dependencies. There is no big secret; I just have some amount of useful shell code. But I certainly won't get around to annotating it in the next couple of weeks.

Kasper

ADD REPLY • link 14.0 years ago Kasper Daniel Hansen ★ 6.5k
