Question: Efficiently convert DNAStringSetList to a character vector
3.8 years ago by
dr10
dr10 wrote:

I'm using the VariantAnnotation package and I want to intersect two VCF files to find overlapping variants. What's the most efficient way to achieve this?

I can intersect the coordinates of the two VCFs with the GenomicRanges::findOverlaps function, but I still want to make sure that the ALT fields of the intersected coordinates match. As these are represented as DNAStringSetLists, it's not clear to me how to do this efficiently.

Answer: Efficiently convert DNAStringSetList to a character vector
3.8 years ago by
United States
Michael Lawrence wrote:

Not sure if it is the "best" way for you, but one way would be to coerce to a VRanges and use the ordinary match() functionality:

vr1 <- as(vcf1, "VRanges")
vr2 <- as(vcf2, "VRanges")
vr1 %in% vr2

etc

You might find VRanges generally useful for your use cases.
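
To make the suggestion concrete, here is a minimal sketch of the VRanges matching approach. It builds two small VRanges objects by hand instead of reading VCF files, so the coordinates, alleles, and the names `vr1`, `vr2`, and `shared` are made up for illustration. As far as I understand the VRanges documentation, match() and %in% compare on chromosome, start, width, and alt, which is the assumption this sketch relies on; worth verifying on your own data.

```r
library(VariantAnnotation)

# Toy stand-ins for as(vcf1, "VRanges") and as(vcf2, "VRanges").
# All positions and alleles below are invented for the example.
vr1 <- VRanges(seqnames = c("chr1", "chr1", "chr2"),
               ranges = IRanges(c(100, 200, 300), width = 1),
               ref = c("A", "C", "G"),
               alt = c("T", "G", "A"))

vr2 <- VRanges(seqnames = c("chr1", "chr1", "chr2"),
               ranges = IRanges(c(100, 200, 400), width = 1),
               ref = c("A", "C", "G"),
               alt = c("T", "A", "A"))

# Logical vector, one element per variant in vr1, flagging the
# variants that also occur in vr2 (position and ALT allele together).
in_both <- vr1 %in% vr2

# Keep only the variants of vr1 that are shared with vr2.
shared <- vr1[in_both]
shared
```

In the toy data the second variant has the same position in both objects but a different ALT allele, and the third differs in position, so only the first is meant to count as shared. With real files you would replace the two VRanges() calls with the coercions shown above.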