Efficiently convert DNAStringSetList to a character vector
dr ▴ 10
@dr-9473
Last seen 12 months ago
United States

I'm using the VariantAnnotation package and I want to intersect two VCF files to find overlapping variants. What's the most efficient way to achieve this?

I can intersect the coordinates of the two VCFs with the GenomicRanges::findOverlaps function, but I still want to make sure that the ALT fields of the intersected coordinates match. Since these are represented as DNAStringSetList objects, it's not clear to me how to do this efficiently.

variantannotation • 950 views
@michael-lawrence-3846
Last seen 2.4 years ago
United States

Not sure if it is the "best" way for you, but one way would be to coerce to a VRanges and use the ordinary match() functionality:

library(VariantAnnotation)
vr1 <- as(vcf1, "VRanges")
vr2 <- as(vcf2, "VRanges")
# TRUE for each variant in vr1 that also occurs in vr2
vr1 %in% vr2

etc

You might find VRanges generally useful for your use cases.
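To illustrate the matching behaviour, here is a minimal sketch on toy data. The variants below are made up for the example, and the VRanges objects are built directly with the constructor rather than coerced from VCF files. As far as I understand the VRanges documentation, match() and %in% compare on the combination of chromosome, start, width, and alt, so a variant at the same position with a different ALT allele should not count as a match.

```r
library(VariantAnnotation)

# Hypothetical toy variants standing in for as(vcf1, "VRanges") and
# as(vcf2, "VRanges"); the second variant differs only in its ALT allele.
vr1 <- VRanges(seqnames = c("chr1", "chr1", "chr2"),
               ranges = IRanges(start = c(100, 200, 300), width = 1),
               ref = c("A", "C", "G"),
               alt = c("T", "G", "A"))
vr2 <- VRanges(seqnames = c("chr1", "chr1", "chr2"),
               ranges = IRanges(start = c(100, 200, 300), width = 1),
               ref = c("A", "C", "G"),
               alt = c("T", "T", "A"))

# Logical vector: which variants in vr1 also occur in vr2
in_both <- vr1 %in% vr2

# Keep only the shared variants
shared <- vr1[in_both]

# Index into vr2 for each variant in vr1 (NA where there is no match)
hits <- match(vr1, vr2)
```

With real data you would replace the toy objects with the coerced VCFs. Note that for a multi-sample VCF the coercion to VRanges yields one row per sample and ALT allele, so you may want to subset to a single sample first.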
