I wrote an R function, filterVariantCalls(), that takes a bgzipped VCF as input, filters out variants with too many missing genotypes, and writes a new bgzipped VCF. Here is an example:
    library(rutilstimflutre) # my personal R pkg, see below
    library(IRanges) # see https://support.bioconductor.org/p/73957/#73962
    filterVariantCalls(vcf.file="input.vcf.gz", genome="test",
                       out.file="output.vcf", yieldSize=10000,
                       max.prop.gt.na=50)
It starts well:
    FilterRules of length 1
    names(1): filterGtProp
    starting filter
    filtering 10000 records
But then I receive the following error:
    Error in extractROWS(x, eval(filter, x)) :
      error in evaluating the argument 'i' in selecting a method for function 'extractROWS': Error in rule(envir) :
      promise already under evaluation: recursive default argument reference or earlier problems?
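For reference, the last part of this message is a base R error that can be reproduced without any Bioconductor package. The sketch below is only an illustration of what the message means in general, not a diagnosis of my case: the function f is hypothetical, and its default argument refers to itself, so R tries to evaluate the promise for x while it is already evaluating it.

```r
# Hypothetical function: the default value of 'x' refers to 'x' itself.
f <- function(x = x) x

# Calling f() without arguments forces the self-referencing promise;
# capture the error message instead of stopping the script.
msg <- tryCatch(f(), error = function(e) conditionMessage(e))
print(msg)
```

Whether something similar happens inside the filter rule I pass to filterVcf() is exactly what I cannot tell.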
Following good R debugging practice, I ran traceback(), which gives:
    8: extractROWS(x, eval(filter, x))
    7: .local(x, filter, ...)
    6: subsetByFilter(vcfChunk, filters)
    5: subsetByFilter(vcfChunk, filters)
    4: .filter(file, genome, destination, verbose, filters, param, ...)
    3: VariantAnnotation::filterVcf(file = tabix.file, genome = genome,
           destination = out.file, index = TRUE, verbose = (verbose > 0),
           filters = filters)
    2: VariantAnnotation::filterVcf(file = tabix.file, genome = genome,
           destination = out.file, index = TRUE, verbose = (verbose > 0),
           filters = filters)
    1: filterVariantCalls(vcf.file = vcf.file, genome = "test",
           out.file = out.file, yieldSize = 10000, max.prop.gt.na = 50)
Does the error come from the fact that frames 5 and 6 of the call stack correspond to the same function?
FYI, the code of the filterVariantCalls() function is in my custom R pkg on GitHub. Note that, as this is my personal R pkg, it has many functions doing different things. To avoid making it depend on too many packages from different fields (genomics, genetics, statistics, etc.), I explicitly use requireNamespace("foo") when I need a function from the foo pkg in one of my functions, as advised by Hadley Wickham in his book.
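The requireNamespace() pattern looks roughly like this. This is a minimal sketch, not the actual code of filterVariantCalls(): the helper medianOrStop is hypothetical, and the base package stats stands in for foo so that the example runs anywhere.

```r
# Hypothetical helper illustrating the pattern; "stats" stands in for "foo".
medianOrStop <- function(x) {
  # Load the namespace without attaching the package to the search path.
  if (!requireNamespace("stats", quietly = TRUE)) {
    stop("Package 'stats' is needed for this function to work.",
         call. = FALSE)
  }
  # Call the function with an explicit pkg:: prefix.
  stats::median(x)
}

res <- medianOrStop(c(1, 5, 3))
print(res)
```

With this pattern the other package is loaded via its namespace but never attached, which is why VariantAnnotation appears under "loaded via a namespace (and not attached)" in the session information.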
Here is my sessionInfo():
    R version 3.2.2 (2015-08-14)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: CentOS release 6.6 (Final)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats4    parallel  stats     graphics  grDevices utils     datasets
    [8] methods   base

    other attached packages:
    [1] IRanges_2.4.0          S4Vectors_0.8.0        BiocGenerics_0.16.0
    [4] rutilstimflutre_0.16.0

    loaded via a namespace (and not attached):
     [1] AnnotationDbi_1.32.0       XVector_0.10.0
     [3] GenomicAlignments_1.6.0    GenomicRanges_1.22.0
     [5] zlibbioc_1.16.0            BiocParallel_1.4.0
     [7] BSgenome_1.38.0            GenomeInfoDb_1.6.0
     [9] tools_3.2.2                SummarizedExperiment_1.0.0
    [11] Biobase_2.30.0             data.table_1.9.6
    [13] DBI_0.3.1                  lambda.r_1.1.7
    [15] futile.logger_1.4.1        rtracklayer_1.30.0
    [17] futile.options_1.0.0       bitops_1.0-6
    [19] RCurl_1.95-4.7             biomaRt_2.26.0
    [21] RSQLite_1.0.0              compiler_3.2.2
    [23] GenomicFeatures_1.22.0     Rsamtools_1.22.0
    [25] Biostrings_2.38.0          XML_3.98-1.3
    [27] VariantAnnotation_1.16.0   chron_2.3-47