seqVCF2GDS Error Converting VCF to GDS file
Dear Bioconductor:

I am a student in SISG Module 17 and used the following code to convert my VCF file to a GDS file:

vcffile <- "data/72S1.vcf.gz"
gdsfile <- "data/72S1.gds"
seqVCF2GDS(vcffile, gdsfile, fmt.import="GT", storage.option="LZMA_RA", verbose=FALSE)

The VCF file was generated from human whole-exome sequencing (WES) using the Enrichment app by Illumina. It contains a single patient.

I received the following error message.

Error in seqVCF2GDS(vcffile, gdsfile, fmt.import = "GT", storage.option = "LZMA_RA", :
  INFO ID 'GMAF' (Number=A) should have 0 value(s), but receives 1.
FILE: C:\Users\winst\Documents\data\72S1.vcf.gz
LINE: 160, COLUMN: 8, RefMinor;GMAF=C|0.04812;phyloP=-1.165;CSQT=1|DDX11L1|ENST00000456328|downstream_gene_variant,1|WASH7P|ENST00000438504|intron_variant&non_coding_transcript_variant

Please help.

Winston Dunn

University of Washington

seqVCF2GDS is strict about VCF files conforming to the VCF standard. In this case, it looks like the header line for "GMAF" declares "Number=A", which means there should be one value per alternate allele. The file itself appears to contain a row with no alternate allele (hence seqVCF2GDS expects 0 values), yet a value is provided for "GMAF". You might be able to solve this just by modifying the header so that "GMAF" no longer requires exactly one value per alternate allele. You can do that in the VCF file itself, or by saving a separate file with just the header and modifying that instead. You could then specify that alternate header in seqVCF2GDS:

hdr <- seqVCF_Header("revised_header.vcf")
gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)
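
To make the "Number=A" mismatch concrete, here is a minimal base-R sketch that counts the alternate alleles in one VCF record and the values supplied for an INFO key. It is not how SeqArray performs the check internally. The helper name `count_info_values` is hypothetical, and the example record is made up (position included), assuming the failing row has "." in the ALT column as described above.

```r
# Hypothetical helper: for one tab-delimited VCF record, compare the number
# of ALT alleles with the number of values given for a single INFO key.
count_info_values <- function(record, key) {
  fields <- strsplit(record, "\t", fixed = TRUE)[[1]]
  alt <- fields[5]                                     # column 5 is ALT
  info <- strsplit(fields[8], ";", fixed = TRUE)[[1]]  # column 8 is INFO

  # "." in ALT means the record has no alternate allele
  n_alt <- if (alt == ".") 0L else
    length(strsplit(alt, ",", fixed = TRUE)[[1]])

  # Find the "key=value" entry and count its comma-separated values
  entry <- info[startsWith(info, paste0(key, "="))]
  n_val <- if (length(entry) == 0) 0L else
    length(strsplit(sub("^[^=]*=", "", entry[1]), ",", fixed = TRUE)[[1]])

  c(expected = n_alt, received = n_val)
}

# Made-up record: no ALT allele, but one GMAF value is present
rec <- paste("1", "14653", ".", "C", ".", ".", "PASS",
             "RefMinor;GMAF=C|0.04812;phyloP=-1.165", sep = "\t")
res <- count_info_values(rec, "GMAF")
print(res)
```

With "Number=A" in the header, a record like this one is expected to carry 0 values for "GMAF" but carries 1, which is the situation the error message reports.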

Thank you, Stephanie! Illumina BaseSpace provides two apps for generating the VCF files, "Enrichment" and "BWA Enrichment", and they cost exactly the same. When I generated the VCF files with BWA Enrichment, the problem did not occur.

zhengx ▴ 30
United States

You can directly modify the header in R:

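# Read the header from the VCF file
hdr <- seqVCF_Header("data/72S1.vcf.gz")
# Declare GMAF as having an unspecified number of values ("." instead of "A")
hdr$info$Number[hdr$info$ID == "GMAF"] <- "."

# Convert using the modified header
gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)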

