Hi,
I've struck an issue running the 'readVcf' function (in the VariantAnnotation package): it fails if it is called after genotypeToSnpMatrix(), or after loading the snpStats package. It works fine if called first. A simple workaround is to load snpStats before VariantAnnotation, but might this be worth a bug report?
Examples below based on the vignette - first with bug:
#--------------------------------------------------------
# Session 1
# vcf1 loads
# vcf2 yields error:
library(VariantAnnotation)
# From the doco/sample files
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf1 <- readVcf(fl, "hg19") # vcf1 is ok!
res1 <- genotypeToSnpMatrix(vcf1)
#messages:
#Attaching package: ‘Matrix’
#The following object is masked from ‘package:VariantAnnotation’:
# expand
#The following object is masked from ‘package:S4Vectors’:
# expand
#Warning message:
# In .local(x, ...) : non-single nucleotide variations are set to NA
vcf2 <- readVcf(fl, "hg19")
# Now same command causes error:
#Error: scanVcf: scanVcf: scanTabix: '.SigArgs' is shorter than '.SigLength' says it should be
#path: /my/path/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz
#index: /my/path/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz.tbi
#path: /my/path/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz
And second with workaround:
#--------------------------------------------------------
# Session 2 - works
# Load snpStats first
library(snpStats)
library(VariantAnnotation)
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf1 <- readVcf(fl, "hg19") # vcf1 is ok!
res1 <- genotypeToSnpMatrix(vcf1)
#Warning message:
# In .local(x, ...) : non-single nucleotide variations are set to NA
vcf2 <- readVcf(fl, "hg19") # vcf2 is ok.
Thanks,
Sarah.
# > sessionInfo()
# R version 3.3.0 (2016-05-03)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 14.04.4 LTS
#
# locale:
# [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
# [5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8 LC_PAPER=en_AU.UTF-8 LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
#
# attached base packages:
# [1] splines stats4 parallel stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] snpStats_1.22.0 Matrix_1.1-2 survival_2.37-7 VariantAnnotation_1.18.7
# [5] Rsamtools_1.24.0 Biostrings_2.40.2 XVector_0.12.1 SummarizedExperiment_1.2.3
# [9] Biobase_2.20.0 GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1
# [13] S4Vectors_0.10.3 BiocGenerics_0.18.0
#
# loaded via a namespace (and not attached):
# [1] AnnotationDbi_1.34.4 zlibbioc_1.18.0 GenomicAlignments_1.8.4 BiocParallel_1.6.6 lattice_0.20-24
# [6] BSgenome_1.40.1 tools_3.3.0 grid_3.3.0 DBI_0.4-1 rtracklayer_1.32.2
# [11] bitops_1.0-6 RCurl_1.95-4.1 biomaRt_2.18.0 RSQLite_0.11.2 GenomicFeatures_1.24.5
# [16] XML_3.98-1.1
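To see exactly what a call attaches as a side effect, one option is to compare the search path before and after it. This is only a minimal sketch in base R; `newly_attached` is a hypothetical helper name, not part of VariantAnnotation or snpStats.

```r
# Hypothetical helper (not from any package): evaluate an expression and
# report which entries it added to the search path.
newly_attached <- function(expr) {
  before <- search()        # search path before the call
  force(expr)               # evaluate the lazily passed expression now
  setdiff(search(), before) # entries that appeared as a side effect
}

# Usage in Session 1 (needs the Bioconductor packages, so left commented out):
# newly_attached(genotypeToSnpMatrix(vcf1))
# Going by the messages and sessionInfo() above, this would presumably list
# snpStats, Matrix and survival, but that is an assumption, not a tested result.
```

Running it once in a fresh session for each of the two load orders should show whether the attach order is the only difference between them.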
I can't reproduce on several platforms. Some of your packages seem out of date. Use library(BiocInstaller); biocValid() and update to a consistent state. I don't see VariantAnnotation version in your sessionInfo() above.
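A quick way to list the installed versions of the packages involved, without attaching any of them, is sketched below. It uses base R only; `installed_version` and `report_versions` are hypothetical helper names, and the package list is just the set that appears in this thread.

```r
# Hypothetical helpers (base R only): look up installed versions without
# loading or attaching the packages themselves.
installed_version <- function(pkg) {
  tryCatch(as.character(utils::packageVersion(pkg)),
           error = function(e) NA_character_)  # NA if not installed
}

report_versions <- function(pkgs) {
  vapply(pkgs, installed_version, character(1))
}

pkgs <- c("VariantAnnotation", "snpStats", "Rsamtools", "Matrix", "survival")
print(report_versions(pkgs))

# The consistency check suggested above (requires BiocInstaller):
# library(BiocInstaller); biocValid()
```

Comparing this output with the versions expected for your Bioconductor release shows which packages are out of step.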
I get a similar error when trying to read in the first example file.
> vcf1 <- readVcf(fl, "hg19")
Error: scanVcf: scanVcf: scanTabix: '.SigArgs' is shorter than '.SigLength' says it should be
path: /home/joneill/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz
index: /home/joneill/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz.tbi
path: /home/joneill/R/x86_64-pc-linux-gnu-library/3.3/VariantAnnotation/extdata/chr22.vcf.gz
I have restarted RStudio several times. Is there a workaround or a fix?