Parsing a VCF file using VariantAnnotation
p_das • 0
@p_das-8653

Hi,

I have a VCF file for a single gene, which I can read into R using readVcf() from the VariantAnnotation Bioconductor package.

I would like to find out how many of the variants are SNPs, how many are indels, and how many are SVs.

This information is in the VT field of INFO, but I have not been able to subset the VCF object on that field.

Any help is appreciated.

 

@valerie-obenchain-4275

Hi,

Extracting data and subsetting the VCF object are documented in ?readVcf and in the vignette, browseVignettes("VariantAnnotation"). There is also an example of filtering a VCF file in ?filterVcf. You may also want to look at the ?isSNV man page.
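If a file has no usable VT field, the helpers described on ?isSNV classify records from REF and ALT instead. The following is only a rough sketch on the package's ex2.vcf sample file. It relies on the default singleAltOnly=TRUE, so records with more than one ALT allele evaluate to FALSE, and these helpers do not identify structural variants.

```r
library(VariantAnnotation)

fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Logical vectors with one element per record; with the default
# singleAltOnly=TRUE, multi-allelic records evaluate to FALSE
snv <- isSNV(vcf)
indel <- isIndel(vcf)

# Number of records in each class
sum(snv)
sum(indel)

# Keep only the SNV records
snv_only <- vcf[snv]
dim(snv_only)
```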

Extracting data is done with info(), geno(), alt(), ref(), etc. Using a sample file from the package,

> fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
> vcf <- readVcf(fl, "hg19")
> info(vcf)$DB
[1]  TRUE FALSE  TRUE FALSE FALSE

Subset 'vcf' on 'DB':

> dim(vcf) # original 
[1] 5 3
> dim(vcf[info(vcf)$DB])  # subset
[1] 2 3
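
The same pattern should carry over to the VT field from the question. A minimal sketch, assuming VT is declared in the header as a single-valued string (Number=1, Type=String) so that info(vcf)$VT comes back as a plain character vector. It uses chr22.vcf.gz, another sample file from the package, which as far as I know carries a VT field. The value "SNP" is what 1000 Genomes files typically use; check the table output for the labels in your own file.

```r
library(VariantAnnotation)

# Sample 1000 Genomes file shipped with the package (assumed to carry INFO/VT)
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Assumes VT is Number=1, Type=String, so this is a character vector
vt <- info(vcf)$VT

# Count variants per VT value; NA is kept as its own category if present
vt_counts <- table(vt, useNA="ifany")
vt_counts

# Subset on one VT value; which() drops NA so the index is safe
snps <- vcf[which(vt == "SNP")]
dim(snps)
```

If VT is declared with Number=. in your file, info(vcf)$VT may be a list-like object rather than a character vector, and the comparison would need adjusting.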

Valerie

 


Thank you Valerie!
