Jeremy Leipzig
I think the Bioconductor workflows for variant studies are great, but
TabixFile or readVcf might need some additional sanity checks.
If a careless user (not me of course - I'm asking for a friend)
mistakenly assumes that the uncompressed VCF is the "TabixFile" and that
the vcf.bgz.tbi is its Tabix index, they will get this cryptic error when
attempting to load certain ranges from the VCF file:
> tab<-TabixFile(vcfFile,paste(vcfFile,"bgz.tbi",sep="."))
> vcf<-readVcf(tab,genome="b37",param=some_param)
Error in lapply(names(vcf[[1]]), function(elt) { :
error in evaluating the argument 'X' in selecting a method for
function 'lapply': Error in vcf[[1]] : subscript out of bounds
This leads to a lot of futile debugging under the assumption that either
the ranges or the VCF itself are corrupt, since loading the VCF without
ranges does not rely on Tabix and so gives no hint that the index is the
problem.
The Tabix setup is especially prone to error since compressing the VCF
seems like just an intermediate step and is often performed in the shell
instead of within R. Something that tests whether a Tabix index is really
associated with a "TabixFile" would be helpful.
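To make the request concrete, here is a rough sketch of the kind of check I
have in mind, in base R only. The names is_bgzf and check_tabix_pair are
hypothetical (they are not part of Rsamtools or VariantAnnotation), and the
test is only a heuristic: it assumes the "BC" subfield is the first extra
field in the BGZF header, which is what bgzip normally writes, and it does
not verify that the index was built from this particular file.

```r
# Hypothetical helpers, not part of any Bioconductor package.

is_bgzf <- function(path) {
  # A BGZF block starts with the gzip magic (1f 8b 08) with the FEXTRA
  # flag (04) set; bgzip then writes a "BC" extra subfield at bytes 13-14.
  con <- file(path, "rb", raw = TRUE)  # raw = TRUE: no auto-decompression
  on.exit(close(con))
  hdr <- readBin(con, what = "raw", n = 14L)
  length(hdr) == 14L &&
    identical(hdr[1:4], as.raw(c(0x1f, 0x8b, 0x08, 0x04))) &&
    identical(hdr[13:14], charToRaw("BC"))
}

check_tabix_pair <- function(file, index = paste(file, "tbi", sep = ".")) {
  if (!file.exists(file)) stop("data file not found: ", file)
  if (!file.exists(index)) stop("index file not found: ", index)
  if (!is_bgzf(file))
    stop("'", file, "' does not look bgzip-compressed, so it cannot be ",
         "the file this Tabix index was built from")
  # The .tbi file is itself compressed; once decompressed it should
  # begin with the four bytes "TBI\1".
  con <- gzfile(index, "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 4L)
  if (!identical(magic, as.raw(c(0x54, 0x42, 0x49, 0x01))))
    stop("'", index, "' does not look like a Tabix index")
  invisible(TRUE)
}
```

Calling check_tabix_pair(vcfFile, paste(vcfFile, "bgz.tbi", sep=".")) before
building the TabixFile in the example above would stop at the "does not look
bgzip-compressed" message rather than the subscript error.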
Thanks,
Jeremy