ensemblVEP support for Ensembl API version 78
@moiz-bootwalla-5215

Hi,

I'm trying to use ensemblVEP with version 78 of the Ensembl API, but the package doesn't seem to support that version yet. Are there any plans to add support soon? A new option introduced in version 78 of the API appears to be causing ensemblVEP() to fail. Here is a reproducible example:

fl <- system.file("extdata", "gl_chr1.vcf", package="VariantAnnotation")
gr <- ensemblVEP(fl)
"@PICK_ORDER" is not exported by the Bio::EnsEMBL::Variation::Utils::VEP module
Can't continue after import errors at /home/mbootwalla/repos/ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl line 50.
BEGIN failed--compilation aborted at /home/mbootwalla/repos/ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl line 67.
Error in .io_check_exists(path(con)) : file(s) do not exist:
  '/tmp/RtmpYsB9y8/file2c6244d9bfdd'

Enter a frame number, or 0 to exit   

 1: ensemblVEP(fl)
 2: ensemblVEP(fl)
 3: fun(dest, genome = "")
 4: fun(dest, genome = "")
 5: readVcf(x, "", param = param)
 6: readVcf(x, "", param = param)
 7: .local(file, genome, param, ...)
 8: .readVcf(file, genome, param, row.names = row.names, ...)
 9: .scanVcfToVCF(scanVcf(file, param = param, row.names = row.names), file, genome, param)
10: scanVcfHeader(file)
11: scanVcfHeader(file)
12: scanBcfHeader(file, ...)
13: scanBcfHeader(file, ...)
14: Map(function(file, mode) {
    bf <- open(BcfFile(file, character(0), ...))
    on.exit(close(bf))
    scanBc
15: standardGeneric("Map")
16: eval(.dotsCall, env)
17: eval(.dotsCall, env)
18: eval(expr, envir, enclos)
19: .Method(..., f = f)
20: mapply(FUN = f, ..., SIMPLIFY = FALSE)
21: (function (file, mode) 
{
    bf <- open(BcfFile(file, character(0), ...))
    on.exit(close(bf))
    scanBcfH
22: open(BcfFile(file, character(0), ...))
23: open.BcfFile(BcfFile(file, character(0), ...))
24: .io_check_exists(path(con))

Output of sessionInfo()

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-suse-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ensemblVEP_1.6.0         VariantAnnotation_1.12.4 Rsamtools_1.18.2         Biostrings_2.34.0       
 [5] XVector_0.6.0            GenomicRanges_1.18.3     GenomeInfoDb_1.2.3       IRanges_2.0.0           
 [9] S4Vectors_0.4.0          BiocGenerics_0.12.1      BiocInstaller_1.16.1    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1    base64enc_0.1-2         BatchJobs_1.5           BBmisc_1.8             
 [5] Biobase_2.26.0          BiocParallel_1.0.0      biomaRt_2.22.0          bitops_1.0-6           
 [9] brew_1.0-6              BSgenome_1.34.0         checkmate_1.5.0         codetools_0.2-9        
[13] DBI_0.3.1               digest_0.6.4            fail_1.2                foreach_1.4.2          
[17] GenomicAlignments_1.2.1 GenomicFeatures_1.18.2  iterators_1.0.7         RCurl_1.95-4.4         
[21] RSQLite_1.0.0           rtracklayer_1.26.2      sendmailR_1.2-1         stringr_0.6.2          
[25] tools_3.1.2             XML_3.98-1.1            zlibbioc_1.12.0

 

Thanks,

Moiz

@valerie-obenchain-4275

Hi Moiz,

I can't reproduce the error. Both version 77 and 78 of the perl script work with the current ensemblVEP packages in release and devel.

I think you may have updated your script version to 78 but the API is still 77. Did you run INSTALL.pl after downloading?
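One rough way to check for that kind of mismatch from R is to see whether the VEP.pm module being loaded mentions the symbol the version 78 script tries to import. This is only a sketch: `module_mentions_symbol()` is a hypothetical helper, the path below is a placeholder for wherever your ensembl-variation checkout lives, and a plain text search is a heuristic, not proof that Perl will load that file.

```r
# Heuristic check: does a Perl module file mention a given symbol at all?
# (hypothetical helper, not part of ensemblVEP)
module_mentions_symbol <- function(module_file, symbol) {
  if (!file.exists(module_file)) {
    stop("module file not found: ", module_file)
  }
  lines <- readLines(module_file, warn = FALSE)
  # fixed = TRUE treats the symbol as literal text, not a regular expression
  any(grepl(symbol, lines, fixed = TRUE))
}

# Placeholder location (an assumption); substitute your own checkout path.
vep_module <- file.path(Sys.getenv("HOME"), "repos", "ensembl-variation",
                        "modules", "Bio", "EnsEMBL", "Variation", "Utils", "VEP.pm")
if (file.exists(vep_module)) {
  message("PICK_ORDER mentioned: ",
          module_mentions_symbol(vep_module, "PICK_ORDER"))
}
```

If the symbol is missing from the module, the API checkout is probably older than the script.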

I have not yet updated ensemblVEP to allow specification of the new flags in version 78 (flag_pick, flag_pick_allele, pick_order, tsl). If you had tried to specify one of these in the param, the call would fail (that error I can reproduce!). I'll add these options in the next couple of days and post back when it's done.

Let me know if you still have problems after running INSTALL.pl.

Valerie

script version 78:

>> fl <- system.file("extdata", "gl_chr1.vcf", package="VariantAnnotation")
>> gr <- ensemblVEP(fl)
> UNIVERSAL->import is deprecated and will be removed in a future perl at /home/vobencha/apps/ensembl-tools-release-78/scripts/variant_effect_predictor/Bio/Tree/TreeFunctionsI.pm line 94.
> 2014-12-21 08:46:26 - Starting...
> 2014-12-21 08:46:26 - Detected format of input file as vcf
> 2014-12-21 08:46:26 - Read 3 variants into buffer
> 2014-12-21 08:46:26 - Reading transcript data from cache and/or database
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:46:47 - Retrieved 9 transcripts (0 mem, 0 cached, 10 DB, 1 duplicates)
> 2014-12-21 08:46:47 - Analyzing chromosome 1
> 2014-12-21 08:46:47 - Analyzing variants
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:46:47 - Calculating consequences
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:46:47 - Processed 3 total variants (0 vars/sec, 0 vars/sec total)
> 2014-12-21 08:46:47 - Wrote stats summary to /tmp/RtmpFWT478/file12be2f700878_summary.html
> 2014-12-21 08:46:47 - Finished!

script version 77:

>> fl <- system.file("extdata", "gl_chr1.vcf", package="VariantAnnotation")
>> gr <- ensemblVEP(fl)
> UNIVERSAL->import is deprecated and will be removed in a future perl at /home/vobencha/apps/ensembl-tools-release-77/scripts/variant_effect_predictor/Bio/Tree/TreeFunctionsI.pm line 94.
> 2014-12-21 08:55:50 - Starting...
> 2014-12-21 08:55:50 - Detected format of input file as vcf
> 2014-12-21 08:55:50 - Read 3 variants into buffer
> 2014-12-21 08:55:50 - Reading transcript data from cache and/or database
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:56:11 - Retrieved 9 transcripts (0 mem, 0 cached, 10 DB, 1 duplicates)
> 2014-12-21 08:56:11 - Analyzing chromosome 1
> 2014-12-21 08:56:11 - Analyzing variants
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:56:11 - Calculating consequences
> [=================================================================================================================================================================================]  [ 100% ]
> 2014-12-21 08:56:11 - Processed 3 total variants (0 vars/sec, 0 vars/sec total)
> 2014-12-21 08:56:11 - Wrote stats summary to /tmp/Rtmp4SiRO0/file142f1255ea42_summary.html
> 2014-12-21 08:56:11 - Finished!

 

> sessionInfo()
> R version 3.1.2 Patched (2014-11-23 r67046)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats4    parallel  stats     graphics  grDevices utils     datasets
> [8] methods   base    
>
> other attached packages:
>  [1] ensemblVEP_1.6.0         VariantAnnotation_1.12.7 Rsamtools_1.18.2       
>  [4] Biostrings_2.34.0        XVector_0.6.0            GenomicRanges_1.18.3   
>  [7] GenomeInfoDb_1.2.3       IRanges_2.0.1            S4Vectors_0.4.0        
> [10] BiocGenerics_0.12.1    
>
> loaded via a namespace (and not attached):
>  [1] AnnotationDbi_1.28.1    BBmisc_1.8              BSgenome_1.34.0       
>  [4] BatchJobs_1.5           Biobase_2.26.0          BiocParallel_1.0.0    
>  [7] DBI_0.3.1               GenomicAlignments_1.2.1 GenomicFeatures_1.18.2
> [10] RCurl_1.95-4.3          RSQLite_1.0.0           XML_3.98-1.1          
> [13] base64enc_0.1-2         biomaRt_2.22.0          bitops_1.0-6          
> [16] brew_1.0-6              checkmate_1.5.0         codetools_0.2-9       
> [19] digest_0.6.4            fail_1.2                foreach_1.4.2         
> [22] iterators_1.0.7         rtracklayer_1.26.2      sendmailR_1.2-1       
> [25] stringr_0.6.2           tools_3.1.2             zlibbioc_1.12.0 

ensemblVEP is now updated in release (1.6.1) and devel (1.7.1) to include the Ensembl API version 78 flags. These versions should be available via biocLite() Tuesday ~noon PST.

 

Valerie 


Thanks Valerie for adding in support so quickly. I will try updating the packages later today. 

It's strange but I restarted my machine and re-ran the above code and everything works fine now. Not sure what I was doing wrong.

I just want to run my setup by you to be sure that what I did in setting up VEP isn't the issue.

I installed the entire Ensembl API (core, variation, functional and compara modules), then ensembl-tools, which contains VEP. Next I ran the INSTALL.pl script to set up the cache and get the FASTA files, and then indexed them using tabix via the provided convert_cache.pl script. I'm working with the GRCh37 cache since my data still uses the hg19 assembly. Finally, I added the VEP_PATH environment variable to my .Renviron file so that ensemblVEP can find the variant_effect_predictor.pl script.
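For reference, this is roughly how the last step can be sanity-checked from R. It is a minimal sketch: `find_vep_script()` is a hypothetical helper, and it assumes VEP_PATH holds either the script itself or the directory containing it, which may not match how ensemblVEP actually resolves the script.

```r
# Return the path to the VEP script implied by VEP_PATH, or NA if not found.
# (hypothetical helper; the meaning of VEP_PATH here is an assumption)
find_vep_script <- function(vep_path = Sys.getenv("VEP_PATH"),
                            script = "variant_effect_predictor.pl") {
  if (!nzchar(vep_path)) return(NA_character_)
  # file.info() is used instead of dir.exists() so this also runs on R 3.1.x
  is_dir <- isTRUE(file.info(vep_path)$isdir)
  candidate <- if (is_dir) file.path(vep_path, script) else vep_path
  if (file.exists(candidate)) candidate else NA_character_
}

# NA means VEP_PATH is unset or does not lead to an existing file
find_vep_script()
```

Sys.which("variant_effect_predictor.pl") is a complementary check of whether the script is also reachable through PATH.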

Now that the above code works, I tried running VEP in offline mode with the cache and some other options specified, and ran into a new error. It looks like a VEP-specific issue, but when I run VEP directly with exactly the same options it works. I also tried running it after restarting my RStudio session, and directly in an R session in the terminal. I'm not sure what the issue is. The code and output are as follows:

> param <- VEPParam()
> basic(param) <- list(everything=TRUE)
> cache(param) <- list(cache=TRUE, dir="$HOME/software/ensembl-VEP", offline=TRUE)
> identifier(param) <- list(xref_refseq=TRUE)
> colocatedVariants(param) <- list(check_existing=T)
> fl <- system.file("extdata", "gl_chr1.vcf", package="VariantAnnotation")
> gr <- ensemblVEP(fl, param)
2014-12-23 11:15:39 - Read existing cache info
2014-12-23 11:15:39 - Auto-detected FASTA file in cache directory
2014-12-23 11:15:39 - Checking/creating FASTA index
2014-12-23 11:15:39 - Starting...
2014-12-23 11:15:39 - Detected format of input file as vcf
2014-12-23 11:15:39 - Read 3 variants into buffer
2014-12-23 11:15:39 - Checking for existing variations
[>                                              ]    [ 0% ]
Can't call method "db" on an undefined value at /home/mbootwalla/repos/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 5888, <GEN1> line 39.
Warning message:
In is.na(ulst) : is.na() applied to non-(list or vector) of type 'NULL'
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-suse-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ensemblVEP_1.6.0         VariantAnnotation_1.12.4 Rsamtools_1.18.2         Biostrings_2.34.0       
 [5] XVector_0.6.0            GenomicRanges_1.18.3     GenomeInfoDb_1.2.3       IRanges_2.0.0           
 [9] S4Vectors_0.4.0          BiocGenerics_0.12.1      BiocInstaller_1.16.1    

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1    base64enc_0.1-2         BatchJobs_1.5           BBmisc_1.8             
 [5] Biobase_2.26.0          BiocParallel_1.0.0      biomaRt_2.22.0          bitops_1.0-6           
 [9] brew_1.0-6              BSgenome_1.34.0         checkmate_1.5.0         codetools_0.2-9        
[13] DBI_0.3.1               digest_0.6.4            fail_1.2                foreach_1.4.2          
[17] GenomicAlignments_1.2.1 GenomicFeatures_1.18.2  iterators_1.0.7         RCurl_1.95-4.4         
[21] RSQLite_1.0.0           rtracklayer_1.26.2      sendmailR_1.2-1         stringr_0.6.2          
[25] tools_3.1.2             XML_3.98-1.1            zlibbioc_1.12.0     

Thanks,

Moiz

 

 
