Error in seqVCF2GDS FORMAT ID 'AD' should have 1 value(s), but receives 2.
cav3gh • 0
@cav3gh-15680
Last seen 3.7 years ago

I am trying to use SNPRelate to run some data analysis on a VCF file output from STACKS. I have gotten the file to read into R, but when I try to convert it to a GDS file I keep getting the following error:

Error in seqVCF2GDS(vcf.fn, "FullStudy.gds") : FORMAT ID 'AD' should have 1 value(s), but receives 2.
FILE: /Users/allisavincent/Desktop/FullStudy_Current.vcf
LINE: 12, COLUMN: 10, ./.:0:.,.

I have looked at other VCF files to check how AD should be formatted, and in all of them AD is written as 6,4, or as .,. if no information is available. So AD should have 2 values, but for some reason seqVCF2GDS expects it to have only 1. How do I get around this error?

vcf.fn <- "/Users/allisavincent/Desktop/FullStudyCurrent.vcf"
seqarraytest1 <- seqVCF2GDS(vcf.fn, "FullStudy.gds")

seqVCF2GDS • 411 views
qliu7 • 0
@qliu7-13862
Last seen 4 days ago
Roswell Park Cancer Institute

Hi,

First, this function comes from the SeqArray package. Use ?SeqArray::seqVCF2GDS to see the documentation.

If the AD field in FORMAT isn't really needed in your downstream analysis, you could set fmt.import = c("", "", ...), filling in the FORMAT ID names other than AD, so that AD is not read into the GDS file.
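As a rough sketch of that workaround, the snippet below collects the FORMAT IDs declared in the header and passes everything except AD to fmt.import. The helper name format_ids_to_import is made up for illustration, the path is the one from the question, and the call is guarded so nothing runs if SeqArray or the file is missing. It assumes seqVCF_Header() returns the FORMAT table in $format with an ID column, so check ?seqVCF_Header and ?seqVCF2GDS against your installed version.

```r
# Hypothetical helper: all FORMAT IDs except the ones to skip.
format_ids_to_import <- function(ids, skip = "AD") setdiff(ids, skip)

# Path as given in the question; adjust to your own file.
vcf.fn <- "/Users/allisavincent/Desktop/FullStudyCurrent.vcf"

# Guarded so the snippet does nothing when SeqArray or the file is unavailable.
if (requireNamespace("SeqArray", quietly = TRUE) && file.exists(vcf.fn)) {
  hdr  <- SeqArray::seqVCF_Header(vcf.fn)      # parsed VCF header
  keep <- format_ids_to_import(hdr$format$ID)  # every declared FORMAT ID but AD
  SeqArray::seqVCF2GDS(vcf.fn, "FullStudy.gds", fmt.import = keep)
}
```

This only sidesteps the error by dropping AD; it does not repair the mismatch between the header and the data.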

To fix your problem, could you print out the header line with ID=AD? It should look something like this (shown here for GT):

##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
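One way to pull that header line out is with base R, sketched below. The function name format_header and the tiny demo file are invented for illustration; the demo header is not the poster's actual header. In the VCF specification, Number=1 declares exactly one value per sample, while Number=R declares one value per allele and Number=. leaves the count open, so a header that declares AD with Number=1 would be consistent with the error message, though that is only a guess until the real line is shown.

```r
# Return the ##FORMAT header line(s) that declare a given ID.
# Only the first `max_lines` lines are read, since the header sits at the top.
format_header <- function(path, id, max_lines = 1000) {
  lines <- readLines(path, n = max_lines)
  meta  <- lines[startsWith(lines, "##")]
  meta[startsWith(meta, paste0("##FORMAT=<ID=", id, ","))]
}

# Made-up demo file; replace demo.fn with your own vcf.fn.
demo.fn <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.2",
  "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">",
  "##FORMAT=<ID=AD,Number=1,Type=Integer,Description=\"Allele Depth\">"
), demo.fn)

ad_line <- format_header(demo.fn, "AD")
print(ad_line)
```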


Also print out the portion of your VCF file at LINE: 12, COLUMN: 10 (./.:0:.,.). It would also be helpful to include the traceback() message and the sessionInfo() output.
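A minimal base R sketch for looking at one cell of the file is below. The helper vcf_cell is hypothetical, and the demo file (including its GT:DP:AD layout) is made up; for the real file you would call it with line_no = 12 and col_no = 10 as reported in the error. Column 9 of a VCF record is the FORMAT column, so printing it next to the sample value shows which subfield each value belongs to.

```r
# Return the FORMAT string (column 9) and one sample column from a given line.
vcf_cell <- function(path, line_no, col_no) {
  line   <- readLines(path, n = line_no)[line_no]
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  c(FORMAT = fields[9], value = fields[col_no])
}

# Made-up three-line demo file; replace demo.fn with your own vcf.fn.
demo.fn <- tempfile(fileext = ".vcf")
writeLines(c(
  "##fileformat=VCFv4.2",
  paste(c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "s1"), collapse = "\t"),
  paste(c("1", "100", ".", "A", "G", ".", "PASS", ".",
          "GT:DP:AD", "./.:0:.,."), collapse = "\t")
), demo.fn)

cell <- vcf_cell(demo.fn, line_no = 3, col_no = 10)
print(cell)
```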

In order to reproduce the error, you may want to produce a smaller VCF file (say, by sampling 100 variants from each chromosome) and put it somewhere on the web so people can retrieve the file and run your code.
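A smaller file along those lines could be produced with something like the base R sketch below. The function subsample_vcf and the demo data are invented for illustration; it reads the whole file into memory, which is fine for moderate files but not for very large ones. Note that random sampling may drop the record that triggers the error, so it is worth checking that the reduced file still fails.

```r
# Keep all header lines and up to n_per_chrom randomly chosen variants per chromosome.
subsample_vcf <- function(in.fn, out.fn, n_per_chrom = 100, seed = 1) {
  lines  <- readLines(in.fn)
  is_hdr <- startsWith(lines, "#")
  body   <- lines[!is_hdr]
  chrom  <- sub("\t.*", "", body)  # first tab-separated field is CHROM
  set.seed(seed)
  keep <- as.integer(unlist(lapply(split(seq_along(body), chrom), function(idx) {
    if (length(idx) <= n_per_chrom) idx else sample(idx, n_per_chrom)
  }), use.names = FALSE))
  writeLines(c(lines[is_hdr], body[sort(keep)]), out.fn)  # preserve file order
  invisible(length(keep))
}

# Made-up demo: 5 variants on chr1, 2 on chr2, at most 3 kept per chromosome.
demo.in  <- tempfile(fileext = ".vcf")
demo.out <- tempfile(fileext = ".vcf")
rec <- function(chrom, pos) paste(c(chrom, pos, ".", "A", "G", ".", "PASS", "."),
                                  collapse = "\t")
writeLines(c("##fileformat=VCFv4.2",
             "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
             vapply(1:5, function(p) rec("chr1", p), character(1)),
             vapply(1:2, function(p) rec("chr2", p), character(1))),
           demo.in)
n_kept <- subsample_vcf(demo.in, demo.out, n_per_chrom = 3)
```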

Best, Qian