I'm installing the Bioconductor wrapper for ensembl-vep in a Docker container, using conda. Here is my Dockerfile:
FROM centos:centos7.2.1511
RUN echo " - Installing development tools ..." \
&& yum install -y yum-plugin-ovl \
&& yum groupinstall -y "Development Tools"
# Install miniconda to /miniconda
RUN curl -LO http://repo.continuum.io/miniconda/Miniconda3-4.5.12-Linux-x86_64.sh
RUN bash Miniconda3-4.5.12-Linux-x86_64.sh -p /miniconda -b
RUN rm Miniconda3-4.5.12-Linux-x86_64.sh
ENV PATH=/miniconda/bin:${PATH}
RUN conda update -n base -c defaults conda
RUN conda config --add channels bioconda
RUN conda config --add channels conda-forge
RUN conda install ensembl-vep=94.5 bioconductor-ensemblvep r-base openssl=1.0
ENV PATH "/miniconda/share/ensembl-vep-94.5-0:$PATH"
RUN localedef -i en_US -f UTF-8 en_US.UTF8
Calling vep from the command line works, and I can see that the vep directory is indeed prepended to my $PATH. However, when I start R and load the package with library(ensemblVEP), I get the following output, which indicates that R has not found the vep script (4th line from the bottom):
> library(ensemblVEP)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, basename, cbind, colMeans, colSums, colnames,
dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
intersect, is.unsorted, lapply, lengths, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
rowMeans, rowSums, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which, which.max, which.min
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following object is masked from 'package:base':
expand.grid
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: VariantAnnotation
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: DelayedArray
Loading required package: matrixStats
Attaching package: 'matrixStats'
The following objects are masked from 'package:Biobase':
anyMissing, rowMedians
Loading required package: BiocParallel
Attaching package: 'DelayedArray'
The following objects are masked from 'package:matrixStats':
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following objects are masked from 'package:base':
aperm, apply
Loading required package: Rsamtools
Loading required package: Biostrings
Loading required package: XVector
Attaching package: 'Biostrings'
The following object is masked from 'package:DelayedArray':
type
The following object is masked from 'package:base':
strsplit
Attaching package: 'VariantAnnotation'
The following object is masked from 'package:base':
tabulate
variant_effect_predictor.pl or vep script not found. Ensembl VEP is not installed in your path.
Attaching package: 'ensemblVEP'
The following object is masked from 'package:Biobase':
cache
This is a bit weird, because when I call Sys.getenv("PATH"), R does in fact seem to have my vep path:
PATH /miniconda/share/ensembl-vep-94.5-0:/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Any hints on what I should do to make the ensemblVEP R package pick up the vep script as installed on the host?
Hi,
Is your goal here to use ensemblVEP as a package inside a Docker image? Or is it to build your own Docker image which uses bioconda to install ensemblVEP?
If it is the first one, then I suggest you use a different base image, bioconductor/bioconductorfull:RELEASE3_8 (this is a temporary suggestion). Please note that the image is not stable yet, but if your goal is just to use it for one package, you should be fine.
If your goal is the second one, building your own image with ensemblVEP installed through miniconda, can you try to just use bioconductor-ensemblvep without installing the dependencies? The conda recipe should take care of it IMO.
Best,
Nitesh
Hi Nitesh, thanks for the suggestions.
My goal is to use the ensemblVEP package inside of a Docker container. I don't care how the container was built or what install system I use.
This package is unusual in that it relies on the ensembl-vep (note spelling difference) software being installed on the host. And this software is unusual in that it has very specific Perl dependencies. Hence my use of conda, which does in fact install the software correctly, although I have not yet found how to make it available to the ensemblVEP R package.
Neither of your suggestions above installs ensembl-vep at all (I confirmed this).
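One way to narrow this down, as a sketch rather than a fix: having a directory on PATH does not by itself guarantee that it contains a file with the expected name, or that the file is executable. The shell function below walks a colon-separated PATH and reports any file named vep or variant_effect_predictor.pl (the two names from the startup message). The function name find_vep_on_path is made up for illustration, and nothing here assumes how ensemblVEP itself performs its lookup.

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not part of ensembl-vep or ensemblVEP).
# Prints every file named "vep" or "variant_effect_predictor.pl" found in the
# directories of a colon-separated PATH string, and whether it is executable.
# Returns 0 if at least one such file was found, 1 otherwise.
find_vep_on_path() {
    # $1: PATH-like string; defaults to the current PATH
    search_path="${1:-$PATH}"
    found=1
    old_ifs="$IFS"
    IFS=:
    for dir in $search_path; do
        for name in vep variant_effect_predictor.pl; do
            if [ -f "$dir/$name" ]; then
                if [ -x "$dir/$name" ]; then
                    echo "executable: $dir/$name"
                else
                    echo "not executable: $dir/$name"
                fi
                found=0
            fi
        done
    done
    IFS="$old_ifs"
    return "$found"
}

# Check the PATH of the current shell
find_vep_on_path || echo "no vep script in any PATH directory"
```

Running it in the same shell that launches R inside the container shows which PATH entries, if any, actually hold the script, and whether it sits in the directory added by the ENV line or somewhere else such as /miniconda/bin.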