Question: Error when running "locateVariants" function in "VariantAnnotation" package
11 weeks ago by
yduan00420
Riverside
yduan00420 wrote:

I get an error when running the locateVariants() function from the VariantAnnotation package. I need to load tidyverse before calling locateVariants(), but when I load tidyverse before VariantAnnotation and then run locateVariants(), it fails with "Error in as.vector(x) : no method for coercing this S4 class to a vector". It works when I load VariantAnnotation and run locateVariants() first. Could anyone explain why this happens, and how I can make it work when tidyverse is loaded first? Here is code that reproduces the error.

The "rd_test.rds" file can be downloaded here: https://drive.google.com/open?id=0BwtKdDxl2iyQaUc5ZW9rM2FPVnc. The "Homo_sapiens.GRCh38.90.sqlite" file can be downloaded here: https://drive.google.com/open?id=0BwtKdDxl2iyQQTZJZlVJS2NzWjA

## Start a new R session

rd_test <- readRDS("rd_test.rds")

library(GenomicFeatures)

txdb <- loadDb("Homo_sapiens.GRCh38.90.sqlite")

library(VariantAnnotation)

allvar <- locateVariants(rd_test, txdb, AllVariants())

library(tidyverse)

allvar <- locateVariants(rd_test, txdb, AllVariants())

## locateVariants() works when VariantAnnotation is loaded and locateVariants() is run first, and tidyverse is loaded afterwards

sessionInfo()

# R version 3.4.1 (2017-06-30)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 16.04.1 LTS
#
# Matrix products: default
# BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
# LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#
# locale:
#   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
# [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
# [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#
# attached base packages:
#   [1] parallel  stats4    stats     graphics  utils     datasets  grDevices methods   base     
#
# other attached packages:
#   [1] dplyr_0.7.2                purrr_0.2.3                readr_1.1.1                tidyr_0.7.0               
# [5] tibble_1.3.4               ggplot2_2.2.1              tidyverse_1.1.1            VariantAnnotation_1.20.3  
# [9] Rsamtools_1.26.2           Biostrings_2.42.1          XVector_0.14.1             SummarizedExperiment_1.4.0
# [13] GenomicFeatures_1.26.4     AnnotationDbi_1.36.2       Biobase_2.34.0             GenomicRanges_1.26.4      
# [17] GenomeInfoDb_1.10.3        IRanges_2.8.2              S4Vectors_0.12.2           BiocGenerics_0.20.0       
# [21] setwidth_1.0-4             colorout_1.1-2            
#
# loaded via a namespace (and not attached):
#   [1] Rcpp_0.12.12             lubridate_1.6.0          lattice_0.20-35          assertthat_0.2.0        
# [5] digest_0.6.12            psych_1.7.5              cellranger_1.1.0         R6_2.2.2                
# [9] plyr_1.8.4               RSQLite_2.0              httr_1.3.1               zlibbioc_1.20.0         
# [13] rlang_0.1.2              readxl_1.0.0             lazyeval_0.2.0           blob_1.1.0              
# [17] Matrix_1.2-11            BiocParallel_1.8.2       stringr_1.2.0            foreign_0.8-69          
# [21] RCurl_1.95-4.8           bit_1.1-12               biomaRt_2.30.0           munsell_0.4.3           
# [25] broom_0.4.2              modelr_0.1.1             compiler_3.4.1           rtracklayer_1.34.2      
# [29] pkgconfig_2.0.1          mnormt_1.5-5             XML_3.98-1.9             GenomicAlignments_1.10.1
# [33] bitops_1.0-6             grid_3.4.1               nlme_3.1-131             jsonlite_1.5            
# [37] gtable_0.2.0             DBI_0.7                  magrittr_1.5             scales_0.4.1            
# [41] stringi_1.1.5            reshape2_1.4.2           bindrcpp_0.2             xml2_1.1.1              
# [45] tools_3.4.1              forcats_0.2.0            bit64_0.9-7              BSgenome_1.42.0         
# [49] glue_1.1.1               hms_0.3                  yaml_2.1.14              colorspace_1.3-2        
# [53] rvest_0.3.2              memoise_1.1.0            bindr_0.1                haven_1.1.0  

 

## Start a new R session

rd_test <- readRDS("rd_test.rds")

library(GenomicFeatures)

txdb <- loadDb("Homo_sapiens.GRCh38.90.sqlite")

library(tidyverse)

library(VariantAnnotation)

allvar <- locateVariants(rd_test, txdb, AllVariants())
# 'select()' returned 1:1 mapping between keys and columns
# Error in as.vector(x) : no method for coercing this S4 class to a vector

traceback()

sessionInfo()
# R version 3.4.1 (2017-06-30)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 16.04.1 LTS
#
# Matrix products: default
# BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
# LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#
# locale:
#   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
# [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
# [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#
# attached base packages:
#   [1] parallel  stats4    stats     graphics  utils     datasets  grDevices methods   base     
#
# other attached packages:
#   [1] VariantAnnotation_1.20.3   Rsamtools_1.26.2           Biostrings_2.42.1          XVector_0.14.1            
# [5] SummarizedExperiment_1.4.0 dplyr_0.7.2                purrr_0.2.3                readr_1.1.1               
# [9] tidyr_0.7.0                tibble_1.3.4               ggplot2_2.2.1              tidyverse_1.1.1           
# [13] GenomicFeatures_1.26.4     AnnotationDbi_1.36.2       Biobase_2.34.0             GenomicRanges_1.26.4      
# [17] GenomeInfoDb_1.10.3        IRanges_2.8.2              S4Vectors_0.12.2           BiocGenerics_0.20.0       
# [21] setwidth_1.0-4             colorout_1.1-2            
#
# loaded via a namespace (and not attached):
#   [1] Rcpp_0.12.12             lubridate_1.6.0          lattice_0.20-35          assertthat_0.2.0        
# [5] digest_0.6.12            psych_1.7.5              cellranger_1.1.0         R6_2.2.2                
# [9] plyr_1.8.4               RSQLite_2.0              httr_1.3.1               zlibbioc_1.20.0         
# [13] rlang_0.1.2              readxl_1.0.0             lazyeval_0.2.0           blob_1.1.0              
# [17] Matrix_1.2-11            BiocParallel_1.8.2       stringr_1.2.0            foreign_0.8-69          
# [21] RCurl_1.95-4.8           bit_1.1-12               biomaRt_2.30.0           munsell_0.4.3           
# [25] broom_0.4.2              compiler_3.4.1           modelr_0.1.1             rtracklayer_1.34.2      
# [29] pkgconfig_2.0.1          mnormt_1.5-5             XML_3.98-1.9             GenomicAlignments_1.10.1
# [33] bitops_1.0-6             grid_3.4.1               nlme_3.1-131             jsonlite_1.5            
# [37] gtable_0.2.0             DBI_0.7                  magrittr_1.5             scales_0.4.1            
# [41] stringi_1.1.5            reshape2_1.4.2           bindrcpp_0.2             xml2_1.1.1              
# [45] tools_3.4.1              forcats_0.2.0            bit64_0.9-7              BSgenome_1.42.0         
# [49] glue_1.1.1               hms_0.3                  yaml_2.1.14              colorspace_1.3-2        
# [53] rvest_0.3.2              memoise_1.1.0            bindr_0.1                haven_1.1.0 

## locateVariants() fails when tidyverse is loaded first, then VariantAnnotation is loaded and locateVariants() is run

 

I need to load tidyverse before running locateVariants(), so I hope someone can help me with this. Thanks!

Yuzhu

 

 

modified 10 weeks ago by Hervé Pagès ♦♦ 13k • written 11 weeks ago by yduan00420
10 weeks ago by
Hervé Pagès ♦♦ 13k
United States
Hervé Pagès ♦♦ 13k wrote:

Hi,

After more investigation on this, it seems to be an S4 problem only. What's causing it is that both the lubridate (which tidyverse depends on) and S4Vectors packages define a "union" S4 generic. So after loading the 2 packages, 2 "union" S4 generics end up in the cache. Because of some issues in the methods package, this confuses getGeneric(), standardGeneric(), callNextMethod(), and maybe a few other things.

I committed a workaround in S4Vectors that addresses the original issue you reported. It's just a workaround though. Hopefully the methods package will be fixed in the next release of R to handle homonymic generics properly. The workaround is in S4Vectors 0.14.4 (release) and 0.15.7 (devel). Both versions should become available via biocLite() in the next 48 hours or so.
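To see whether an installation already carries the workaround, a small version check along these lines may help. This is only a sketch: has_union_workaround() is a hypothetical helper, not part of S4Vectors, and it assumes that every version after the 0.15 devel line also includes the change.

```r
# Hypothetical helper (not part of S4Vectors): does a given S4Vectors
# version include the workaround described above?
#   release line 0.14.x: needs >= 0.14.4
#   devel line   0.15.x: needs >= 0.15.7
# Versions beyond 0.15.x are ASSUMED to carry it as well.
has_union_workaround <- function(v) {
  v <- package_version(v)
  if (v >= "0.15.0") v >= "0.15.7" else v >= "0.14.4"
}

# Check the installed copy, if there is one.
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  print(has_union_workaround(utils::packageVersion("S4Vectors")))
} else {
  message("S4Vectors is not installed")
}
```

With the S4Vectors 0.12.2 shown in the sessionInfo() output above, this check would return FALSE.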

Cheers,

H.

written 10 weeks ago by Hervé Pagès ♦♦ 13k

Hi Herve, 

Thanks for your solution! I am looking forward to using your new release of S4Vectors package!

Cheers,

Yuzhu

written 10 weeks ago by yduan00420
10 weeks ago by
University of Connecticut
Pariksheet Nanda70 wrote:

There are several select() functions to subset by columns. One is defined by tidyverse (in dplyr), but in your case you need the ones defined by the AnnotationDbi and biomaRt packages. You can check which select() function is found first in your environment by typing its name without parentheses:

## Before importing tidyverse
> select
standardGeneric for "select" defined from package "AnnotationDbi"

function (x, keys, columns, keytype, ...) 
standardGeneric("select")
<environment: 0x7c1c000>
Methods may be defined for arguments: x
Use  showMethods("select")  for currently available ones.
>

When you load the tidyverse package you will see a list of conflicts. Here many tidyverse functions can take precedence over the Bioconductor ones:

## Messages from importing tidyverse
Conflicts with tidy packages ---------------------------------------------------
collapse(): dplyr, Biostrings, IRanges
combine():  dplyr, Biobase, BiocGenerics
compact():  purrr, XVector
count():    dplyr, matrixStats
desc():     dplyr, IRanges
expand():   tidyr, VariantAnnotation, S4Vectors
filter():   dplyr, stats
first():    dplyr, S4Vectors
lag():      dplyr, stats
Position(): ggplot2, BiocGenerics, base
reduce():   purrr, GenomicRanges, IRanges
rename():   dplyr, S4Vectors
select():   dplyr, VariantAnnotation, AnnotationDbi
slice():    dplyr, XVector, IRanges
>

It's not just select(); many other Bioconductor functions are also masked in the list above: collapse(), combine(), compact(), desc(), expand(), first(), reduce(), rename(), and slice().

After importing tidyverse we can see that the dplyr select() takes precedence by again typing its name without parentheses:

## After importing tidyverse
> select
function (.data, ...) 
{
    UseMethod("select")
}
<environment: namespace:dplyr>
>
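The same check can be done programmatically. The sketch below uses only find() from utils and conflicts() from base; the variable names are illustrative. In a fresh session with no packages attached it simply reports that nothing called select is on the search path.

```r
# Name of the function to look up; "select" is the one discussed here.
fn <- "select"

# Search-path entries that provide an object with this name.
# The first entry is the one an unqualified call will use.
providers <- utils::find(fn)
print(providers)

# All names that are defined in more than one place on the search path,
# grouped by search-path entry.
masked <- conflicts(detail = TRUE)

# Keep only the entries that define the name we are interested in.
defines_fn <- Filter(function(nms) fn %in% nms, masked)
print(names(defines_fn))
```

If package:dplyr is listed before package:AnnotationDbi, an unqualified select() call will go to dplyr.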

You can work around this by loading tidyverse first, then all your Bioconductor packages:

## Start a new R session
library(tidyverse)
library(GenomicFeatures)
library(VariantAnnotation)

rd_test <- readRDS("rd_test.rds")
txdb <- loadDb("Homo_sapiens.GRCh38.90.sqlite")
allvar <- locateVariants(rd_test, txdb, AllVariants())

Now when you load GenomicFeatures you will see:

The following object is masked from ‘package:dplyr’:

    select

If you ever need to use the dplyr select() or some of the other masked tidyverse functions, you should call them explicitly with the double-colon notation:

dplyr::select()
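
The same idea can be tried in a stock R session without any extra packages. This is a minimal sketch: the global filter() below is a hypothetical stand-in for a masking package, and stats::filter() plays the role of the masked function.

```r
# Hypothetical stand-in for a masking package: a global function named
# filter() that shadows stats::filter() for unqualified calls.
filter <- function(x, ...) "masked version"

# An unqualified call now finds the global definition first.
unqualified <- filter(1:5)

# A namespace-qualified call bypasses the search path and still reaches
# the stats implementation (here a 3-point moving average).
qualified <- stats::filter(1:5, rep(1/3, 3))

print(unqualified)
print(class(qualified))
```

The qualified call works no matter what order the packages were attached in, which is why it is the safer choice inside scripts.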
written 10 weeks ago by Pariksheet Nanda70

Hi Nanda, thanks for answering my question. I ran this chunk of code:

## Start a new R session
library(tidyverse)
library(GenomicFeatures)
library(VariantAnnotation)

rd_test <- readRDS("rd_test.rds")
txdb <- loadDb("Homo_sapiens.GRCh38.90.sqlite")
allvar <- locateVariants(rd_test, txdb, AllVariants())

But the error still occurs (Error in as.vector(x) : no method for coercing this S4 class to a vector).

> showMethods("select")
Function: select (package AnnotationDbi)
x="ChipDb"
x="GODb"
x="Inparanoid8Db"
x="InparanoidDb"
x="Mart"
x="OrgDb"
x="PolyPhenDb"
x="PROVEANDb"
x="ReactomeDb"
x="SIFTDb"
x="TxDb"

The select method comes from the AnnotationDbi package, so I don't think it is the cause of this problem. Could you please look at it again?

Thanks!

Yuzhu

written 10 weeks ago by yduan00420

I believe this is complicated, but it is due to the presence of both tidyverse and S4Vectors. I don't think it's easy to avoid, other than by not loading tidyverse. Here's a simpler illustration:

> library(tidyverse)
> library(VariantAnnotation)
> example(locateVariants)
...

lctVrn>   loc_all <- locateVariants(vcf, txdb, AllVariants())
'select()' returned many:1 mapping between keys and columns
Error in as.vector(x) : no method for coercing this S4 class to a vector

 This can be debugged and simplified with

> traceback()
35: as.vector(x)
34: unique(c(as.vector(x), as.vector(y)))
33: .nextMethod(as(x, "Hits"), as(y, "Hits"))
32: eval(call, callEnv)
31: eval(call, callEnv)
30: callNextMethod(as(x, "Hits"), as(y, "Hits"))
29: .class1(object)
28: as(callNextMethod(as(x, "Hits"), as(y, "Hits")), class(x))
27: .local(x, y, ...)
26: union(fo_start, fo_end)

fo_start and fo_end are 'Hits' objects, and

> union(Hits(), Hits())
Error in as.vector(x) : no method for coercing this S4 class to a vector

> selectMethod("union", c("Hits", "Hits"))
Method Definition (Class "derivedDefaultMethod"):

function (x, y) 
unique(c(as.vector(x), as.vector(y)))
<bytecode: 0x487b140>
<environment: namespace:base>

Signatures:
        x      y     
target  "Hits" "Hits"
defined "ANY"  "ANY" 

Whereas in a session without tidyverse we have

> selectMethod("union", c("Hits", "Hits"))
Method Definition:

function (x, y, ...) 
{
    .local <- function (x, y) 
    as(callNextMethod(as(x, "Hits"), as(y, "Hits")), class(x))
    .local(x, y, ...)
}
<environment: namespace:S4Vectors>

Signatures:
        x      y     
target  "Hits" "Hits"
defined "Hits" "Hits"

The root of the problem is that dplyr 'promotes' union to an S3 generic

> dplyr:::union
function (x, y, ...) 
UseMethod("union")
<environment: namespace:dplyr>

That permanently changes method dispatch, even though S4Vectors has correctly defined methods.

written 10 weeks ago by Martin Morgan ♦♦ 20k

Hi Martin,

Thanks for identifying the problem! I hope this bug can be fixed in the next release of R!

Cheers,

Yuzhu

written 10 weeks ago by yduan00420