Extract structural variant and flanking sequence from VCF and fasta, in R
Max Planck institute for plant breeding…

Hi all,

I am quite new to R/RStudio and am trying to use it together with VariantAnnotation (Bioconductor) to extract structural variant data and flanking sequences from available VCF and genome (FASTA) files.

Quite recently, VCFs (VCFv4.1, source = sniffles) of over 100 tomato accessions were uploaded to the Solgenomics website. Using these together with the SL4.0 genome FASTA, I would like to extract structural variant data and flanking sequences per tomato accession in a semi-automated way, with output as follows.


Eventually, the goal is to use this data for marker design or similar activities.

I have tried various manuals, help pages and forums. However, since I am still a rookie when it comes to R, I find these so dense with information that they are overwhelming. I was therefore hoping someone could point me in the right direction, help me get started with writing the code, and/or provide some explanation.

Thank you very much in advance!

- Willem

VariantAnnotation Bioconductor VCF Structural variants R • 1.0k views

Can you provide a link to the specific files that you are working with?
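
In the meantime, here is a rough sketch of one way this could be approached with VariantAnnotation and Rsamtools. It is not tested on the Solgenomics files and rests on several assumptions: the VCF header declares `SVTYPE` and `END` as single-valued INFO fields (as sniffles output usually does), the chromosome names in the VCF match those in the FASTA, and translocation/breakend records are skipped because their end point can lie on another chromosome. The function name `extract_sv_flanks` and its arguments are hypothetical, chosen only for illustration.

```r
library(VariantAnnotation)  # readVcf(), rowRanges(), info()
library(Rsamtools)          # FaFile(), indexFa(), scanFaIndex(), getSeq()

# Hypothetical helper: returns one row per structural variant with the
# sequence immediately left and right of it (flank_width bases each).
extract_sv_flanks <- function(vcf_path, fasta_path, flank_width = 100L,
                              sv_types = c("DEL", "INS", "DUP", "INV")) {
  # getSeq() on a FaFile needs a .fai index next to the FASTA file
  if (!file.exists(paste0(fasta_path, ".fai"))) indexFa(fasta_path)
  fa  <- FaFile(fasta_path)
  idx <- scanFaIndex(fa)
  chr_len <- setNames(width(idx), as.character(seqnames(idx)))

  vcf <- readVcf(vcf_path)
  gr  <- rowRanges(vcf)

  chrom    <- as.character(seqnames(gr))
  sv_start <- start(gr)
  # Assumption: SVTYPE and END are declared with Number=1 in the header
  sv_type  <- as.character(info(vcf)$SVTYPE)
  sv_end   <- as.integer(info(vcf)$END)
  sv_end[is.na(sv_end)] <- sv_start[is.na(sv_end)]

  # Keep the requested SV types and drop variants whose flanks would
  # run past either end of the chromosome
  keep <- sv_type %in% sv_types &
    chrom %in% names(chr_len) &
    sv_start > flank_width
  keep <- keep & (sv_end + flank_width <= chr_len[chrom])

  left  <- GRanges(chrom[keep],
                   IRanges(start = sv_start[keep] - flank_width,
                           end   = sv_start[keep] - 1L))
  right <- GRanges(chrom[keep],
                   IRanges(start = sv_end[keep] + 1L,
                           end   = sv_end[keep] + flank_width))

  data.frame(
    id          = names(gr)[keep],
    chrom       = chrom[keep],
    start       = sv_start[keep],
    end         = sv_end[keep],
    svtype      = sv_type[keep],
    left_flank  = unname(as.character(getSeq(fa, left))),
    right_flank = unname(as.character(getSeq(fa, right))),
    stringsAsFactors = FALSE
  )
}
```

With one VCF per accession, something like `lapply(vcf_files, extract_sv_flanks, fasta_path = "genome.fa", flank_width = 200L)` would give one table per accession, where `vcf_files` and `"genome.fa"` stand in for your own file paths. Whether POS and END need shifting by one base for a given SV type depends on how the caller writes them, so it is worth checking a few records by hand.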

