Use ScanVcfParam() with readVcf() to selectively import your data into R, or filterVcf() to create a new VCF file with an appropriate subset. The primary sources of documentation are the vignettes and man pages of the relevant functions, available from within R in the usual way or from the package landing page.
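As a rough sketch of the selective import, using the example file that ships with VariantAnnotation: the genomic range, the genome label "hg19", and the choice of the LDAF and GT fields are assumptions for illustration only, so substitute the fields that the header of your own file reports.

```r
library(VariantAnnotation)  # also attaches GenomicRanges and Rsamtools

# Example file shipped with the package (bgzipped, with a tabix index alongside)
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")

# Look at the header first to see which INFO and FORMAT fields are available
hdr <- scanVcfHeader(fl)
rownames(info(hdr))
rownames(geno(hdr))

# Import one INFO field and one genotype field, restricted to one region.
# The region below is an arbitrary illustrative choice.
rng <- GRanges("22", IRanges(50300000, 50400000))
param <- ScanVcfParam(info = "LDAF", geno = "GT", which = rng)

# Range-based queries need an indexed file, hence the TabixFile wrapper
vcf <- readVcf(TabixFile(fl), "hg19", param = param)
vcf
```

The result is a VCF object, so info(vcf), geno(vcf) and rowRanges(vcf) give typed access to just the pieces that were requested.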
VCF files are of course just text files, but they are highly structured; grep is ok for some basic manipulations (filterVcf does this for the 'prefilters') but other computations involve unpacking the data more completely.
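The two levels can be sketched with filterVcf(): a prefilter sees the raw text lines (so a grepl() is enough), while a filter sees a parsed VCF chunk. The function names isSnpLine and hasCommonAllele, the "VT=SNP" pattern, and the 0.1 threshold are hypothetical choices for the package's example file, not anything prescribed by the package.

```r
library(VariantAnnotation)

fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")

# Prefilter (hypothetical name): operates on unparsed text lines, grep-style
isSnpLine <- function(x) grepl("VT=SNP", x, fixed = TRUE)

# Filter (hypothetical name): operates on a parsed VCF object, so typed
# accessors such as info() are available
hasCommonAllele <- function(x) {
    af <- info(x)$LDAF
    !is.na(af) & af > 0.1
}

# Process the file in chunks and write the surviving records to a new VCF
tbx <- TabixFile(fl, yieldSize = 5000)
dest <- tempfile(fileext = ".vcf")
filterVcf(tbx, "hg19", dest,
          prefilters = FilterRules(list(isSnpLine = isSnpLine)),
          filters = FilterRules(list(hasCommonAllele = hasCommonAllele)))

# The filtered file can be read back like any other VCF
filtered <- readVcf(dest, "hg19")
```

Cheap text-level tests go in the prefilters so that only the lines passing them are parsed for the more expensive filters.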
This is maybe a little philosophical, but there is tremendous value in semantically 'rich' data that one loses with dplyr; a short compare and contrast is, for instance, at slides 14 - 16 of these slides. This value is compounded the more you use Bioconductor -- for a one-off it seems like overkill, but for daily use you find yourself spending less time worrying about data representation and more time addressing the informatic, statistical, and biological questions that motivate your research.