seqVCF2GDS Error Converting VCF to GDS file

Dear Bioconductor:

I am a student of SISG Module 17 and used the following code to convert my VCF file to a GDS file:

vcffile <- "data/72S1.vcf.gz"
gdsfile <- "data/72S1.gds"
seqVCF2GDS(vcffile, gdsfile, fmt.import="GT", storage.option="LZMA_RA", verbose=FALSE)

The VCF file was generated from human whole-exome sequencing (WES) using the Enrichment app by Illumina. It contains a single patient.

I received the following error message:

Error in seqVCF2GDS(vcffile, gdsfile, fmt.import = "GT", storage.option = "LZMA_RA", :
INFO ID 'GMAF' (Number=A) should have 0 value(s), but receives 1.
FILE: C:\Users\winst\Documents\data\72S1.vcf.gz
LINE: 160, COLUMN: 8, RefMinor;GMAF=C|0.04812;phyloP=-1.165;CSQT=1|DDX11L1|ENST00000456328|downstream_gene_variant,1|WASH7P|ENST00000438504|intron_variant&non_coding_transcript_variant

Please help.

Winston Dunn

University of Washington

seqVCF2GDS is particular about VCF files conforming to the VCF standard. In this case it looks like the header line for "GMAF" has "Number=A", which means there should be one value per alternate allele. The file itself appears to have a row where there is no alternate allele (hence seqVCF2GDS is expecting 0 values), but there is a value provided for "GMAF".

You might be able to solve this just by modifying the header, for example by changing "Number=A" to "Number=." for "GMAF" so that any number of values is accepted. You can do this in the VCF file itself, or by saving a separate file with just the header and modifying that instead. You could then specify that alternate header in seqVCF2GDS:

hdr <- seqVCF_Header("revised_header.vcf")
gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)
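
If you prefer not to edit the header by hand, the separate header file could be produced in base R. The following is only a minimal sketch: `write_revised_header` is a hypothetical helper (not part of SeqArray), the output name "revised_header.vcf" is just the one used above, and it assumes the whole header can be read through `gzfile()` and that the GMAF definition starts with `##INFO=<ID=GMAF,`.

```r
# Hypothetical helper (not a SeqArray function): copy the header lines of a
# VCF file into a separate file, changing the declared Number of one INFO field.
write_revised_header <- function(vcf_path, out_path,
                                 info_id = "GMAF", number = ".") {
  con <- gzfile(vcf_path, "r")  # gzfile() also reads uncompressed text files
  on.exit(close(con))

  # Collect header lines; stop at the first line that does not start with "#".
  header <- character()
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0 || !startsWith(line, "#")) break
    header <- c(header, line)
  }

  # Rewrite "Number=..." only on the definition line of the chosen INFO field.
  target <- startsWith(header, paste0("##INFO=<ID=", info_id, ","))
  header[target] <- sub("Number=[^,>]+", paste0("Number=", number),
                        header[target])

  writeLines(header, out_path)
  invisible(sum(target))  # number of header lines that were changed
}
```

Calling `write_revised_header("data/72S1.vcf.gz", "revised_header.vcf")` should then give a file that can be passed to `seqVCF_Header()` as shown above. It is worth opening the result and checking the GMAF line before running the conversion.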

Thank you Stephanie! Illumina BaseSpace provides two apps for generating VCF files, "Enrichment" and "BWA Enrichment", and they cost exactly the same. When I generated the VCF file with BWA Enrichment, the problem did not occur.

zhengx ▴ 30
United States

You can directly modify the header in R:

hdr <- seqVCF_Header("data/72S1.vcf.gz")
hdr$info$Number[hdr$info$ID == "GMAF"] <- "."

gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)
