Question: Error with R package motifbreakR while trying to read a vcf file with snps.from.file function
Asked 4 weeks ago by svlachavas (Greece/Athens/National Hellenic Research Foundation):

Dear Community,

Briefly, in a current project I am trying to use the R package motifbreakR to identify putative SNPs from somatic variant calling that could cause an "important" disruption of TF-binding sites. The first part of my code reads the SNPs from the relevant VCF file, which I am using as an initial example:

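library(motifbreakR)
library(BSgenome)
library(BSgenome.Hsapiens.UCSC.hg38)  # genome object passed as search.genome below
# pattern is a regular expression: escape the dot and anchor the extension
vcf_files <- list.files(pattern = "\\.vcf$", full.names = TRUE)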

list.files()
[1] "LY14_01.recode.snpEff_annotated.filtered.snps.recode.dbnsfp_anno (1).vcf"

snps.vcf <- snps.from.file(file = vcf_files,
                           search.genome = BSgenome.Hsapiens.UCSC.hg38,
                           format = "vcf")
Error in elementLengths(info(snps)[, "VT"]) :
  could not find function "elementLengths"
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames

Any ideas or suggestions about this error?

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253   
[3] LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                 
[5] LC_TIME=Greek_Greece.1253    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] genoset_1.34.0                    BiocInstaller_1.28.0             
 [3] VariantAnnotation_1.24.5          Rsamtools_1.30.0                 
 [5] SummarizedExperiment_1.8.1        DelayedArray_0.4.1               
 [7] matrixStats_0.53.1                Biobase_2.38.0                   
 [9] BSgenome.Hsapiens.UCSC.hg38_1.4.1 BSgenome_1.46.0                  
[11] rtracklayer_1.38.3                GenomicRanges_1.30.3             
[13] GenomeInfoDb_1.14.0               motifbreakR_1.8.0                
[15] MotifDb_1.20.0                    Biostrings_2.46.0                
[17] XVector_0.18.0                    IRanges_2.12.0                   
[19] S4Vectors_0.16.0                  BiocGenerics_0.24.0    

Best,

Efstathios

 

Answer from Valerie Obenchain (United States), 29 days ago:

Hi,

elementLengths() was deprecated some time ago (2016 I think) and replaced with elementNROWS(). The generic is in the S4Vectors package.
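
As a small illustration of the rename: for an ordinary list, base R's lengths() reports the number of entries in each element, which is the quantity elementNROWS() reports. The sketch below sticks mostly to base R so it runs without Bioconductor. The toy list `vt` is made up for illustration, and the S4Vectors call is guarded because it assumes that package is installed.

```r
# Toy stand-in for a list-like INFO column (values are made up for illustration).
vt <- list(rs1 = "SNP", rs2 = c("SNP", "INDEL"), rs3 = character(0))

# Base R: number of entries in each list element.
n_base <- lengths(vt)
print(n_base)

# S4Vectors replacement for elementLengths(); only run if the package is installed.
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  n_s4 <- S4Vectors::elementNROWS(vt)
  print(n_s4)
}

# Keep only the elements with exactly one entry, as a filter on "VT" might.
single <- vt[n_base == 1]
print(names(single))
```

The last step mirrors the kind of "exactly one value per record" filter that the failing line appears to apply to the VT field.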

The current Bioconductor release and devel branches both use R 3.5. S4Vectors is at version 0.18.3 in release and 0.19.14 in devel; you can see the most current versions on the build reports:

https://www.bioconductor.org/checkResults/3.7/bioc-LATEST/

https://www.bioconductor.org/checkResults/3.8/bioc-LATEST/

Most likely you have a mix of old and newer packages in your install (check with BiocInstaller::biocValid()). I'd recommend updating R and your Bioconductor packages.
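
A quick way to see what is actually installed, before or alongside biocValid(), is to print the versions of the packages involved and compare them with the build reports above. This is only a base-R sketch: the helper name `installed_versions` is hypothetical, and the package list is simply taken from the sessionInfo() in the question.

```r
# Hypothetical helper: report the installed version of each named package,
# or NA when the package is not installed.
installed_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
}

# Packages most relevant to the error in this thread.
print(installed_versions(c("S4Vectors", "VariantAnnotation", "motifbreakR")))
```

Versions that belong to different Bioconductor releases would point to the mixed install described above.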

Valerie


Dear Valerie, thank you for your valuable answer and comments.

My current S4Vectors version is 0.16.0.

So I will probably try to update the package. Do you think this is feasible without updating R, for example by installing a development version? I am currently running some additional analyses and would prefer not to make major changes, unless there is no other way and I have to upgrade to the latest R.

Best,

Efstathios

 

(reply written 25 days ago by svlachavas)

Dear Valerie, sorry to come back to this, but a new error now appears with exactly the same code:

snps.vcf <- snps.from.file(file=vcf_files,search.genome =BSgenome.Hsapiens.UCSC.hg38,format = "vcf")
Error in info(snps) : could not find function "info"
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253   
[3] LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                 
[5] LC_TIME=Greek_Greece.1253    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets 
 [9] methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg38_1.4.1 BSgenome_1.48.0                  
 [3] rtracklayer_1.40.3                GenomicRanges_1.32.3             
 [5] GenomeInfoDb_1.16.0               motifbreakR_1.10.0               
 [7] MotifDb_1.22.0                    Biostrings_2.48.0                
 [9] XVector_0.20.0                    IRanges_2.14.10                  
[11] S4Vectors_0.18.3                  BiocGenerics_0.26.0              
[13] BiocInstaller_1.30.0             

Any ideas about this new error ?

(reply written 14 days ago by svlachavas)

Hi,

I think you've identified a bug in motifbreakR. The maintainer should import info() from VariantAnnotation (currently they only import readVcf()).

The reason you see this error while the example on ?snps.from.file runs smoothly is that you are hitting a different conditional branch in the function: your call uses format="vcf", whereas the man page example uses format="bed".
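
To make the point about branches concrete, here is a toy function (not motifbreakR code; all names are invented for the demo) in which one branch calls a function that does not exist. R only looks up a function when the call is actually evaluated, so the problem stays hidden until that branch runs.

```r
# Toy function: the "vcf" branch calls a helper that is deliberately undefined,
# mimicking a function that was never imported.
read_toy <- function(format = "bed") {
  if (format == "vcf") {
    undefined_helper_for_demo()
  } else {
    "bed branch ran"
  }
}

# Runs fine: the broken branch is never evaluated.
print(read_toy("bed"))

# Fails only now, when the "vcf" branch is taken; capture the message.
res <- tryCatch(read_toy("vcf"), error = function(e) conditionMessage(e))
print(res)
```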

You can see the source by typing the name of the function at the command line:

> snps.from.file
function (file = NULL, dbSNP = NULL, search.genome = NULL, format = "bed")
{
    if (format == "vcf") {
        if (!inherits(search.genome, "BSgenome")) {
            stop(paste0(search.genome, " is not a BSgenome object.\n",
                "Run availible.genomes() and choose the appropriate BSgenome object"))
        }
        genome.name <- genome(search.genome)[[1]]
        snps <- readVcf(file, genome.name)
        snps <- snps[lengths(info(snps)[, "VT"]) == 1, ]
        snps <- rowRanges(snps)[unlist(info(snps)[, "VT"] ==
            "SNP"), ]
        ALTS <- unlist(snps$ALT)

...
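
Until the import is fixed upstream, it may help to check which of the functions used in the "vcf" branch can be found from the search path. The helper `not_visible` below is hypothetical and base R only. The commented-out library() call is an assumed stopgap, not something confirmed in this thread: a function that a package does not import is eventually looked up on the search path, so attaching VariantAnnotation could make info() reachable.

```r
# Hypothetical helper: which of these function names cannot be found when the
# lookup starts from the global environment (i.e. the attached search path)?
not_visible <- function(fn_names) {
  found <- vapply(
    fn_names,
    function(f) exists(f, envir = globalenv(), mode = "function"),
    logical(1)
  )
  fn_names[!found]
}

# Functions called in the "vcf" branch shown above.
print(not_visible(c("readVcf", "info", "rowRanges", "genome")))

# Assumed stopgap (untested here): attach VariantAnnotation, then retry
# snps.from.file() with format = "vcf".
# library(VariantAnnotation)
```

If info is listed as not visible, that matches the error above; an empty result after attaching VariantAnnotation would suggest the call is worth retrying.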

I would suggest filing a bug/issue on the github repo: https://github.com/Simon-Coetzee/motifBreakR.

Valerie

(reply written 14 days ago by Valerie Obenchain)

Dear Valerie,

thank you for looking into this. I have also opened an issue on the GitHub repository but have not received a reply yet, and my earlier emails to the authors also went unanswered, so I do not know whether the package is actively maintained. Is there any custom workaround you would suggest for this bug?

Best,

Efstathios-Iason

(reply written 13 days ago by svlachavas)