Hello,
I keep getting an error when converting VCF to GDS with SNPRelate. I got the same error with two different VCF files, although the conversion worked fine with other datasets previously. The code, error message, and session info are pasted below. The error basically says that a line has fewer columns than expected. I re-ran everything up to this point to make sure the VCF is OK, and I can query it with bcftools without issues, so the file doesn't appear to be corrupted.
Would greatly appreciate any input.
Annat
> snpgdsVCF2GDS("Joint_allSNPjointMAF05.vcf.gz", "Joint_allSNPjointMAF05.gds", method="biallelic.only")
VCF Format --> SNP GDS Format
Method: exacting biallelic SNPs
Number of samples: 1894
Parsing "Joint_allSNPjointMAF05.vcf.gz" ...
Error:
FILE: Joint_allSNPjointMAF05.vcf.gz
LINE: 1615045, COLUMN: 1605, 0/0:.:32,0:32:0:.:.:0,0,571
fewer columns than what expected.
> traceback()
2: .Call(gnr_Parse_VCF4, vcf.fn[i], gfile$root, metidx, readLines,
opfile, 1024L, ref.allele, ignore.chr.prefix, new.env(),
verbose)
1: snpgdsVCF2GDS("Joint_allSNPjointMAF05.vcf.gz", "Joint_allSNPjointMAF05.gds",
method = "biallelic.only")
>
> #snpgdsVCF2GDS("../../../../Raw/1000G/1000Genome_calls_all_chromosomes.vcf.gz","1000G_allSNPjointMAF05.gds", method="biallelic.only")
>
> #snpgdsCombineGeno(c("Joint_allSNPjointMAF05.gds", "1000G_allSNPjointMAF05.gds"),
> # "Joint1000G_allSNPjointMAF05.gds")
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] SNPRelate_1.0.1 gdsfmt_1.2.2
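
One way to check the reported line directly is to count its tab-separated fields and compare that with the #CHROM header line. Below is a minimal sketch in base R. It assumes the LINE number in the error counts from the top of the file with header lines included, which I have not confirmed. count_vcf_fields is a hypothetical helper, not part of SNPRelate. With 1894 samples, a complete data line should have 9 + 1894 = 1903 fields.

```r
# Hypothetical helper (not part of SNPRelate): count the tab-separated fields
# on the #CHROM header line and on one given line of a VCF file.
# Assumption: line_no counts from the first line of the file, headers included.
count_vcf_fields <- function(vcf_path, line_no) {
  # gzfile() reads gzip-compressed files and also plain uncompressed text
  con <- gzfile(vcf_path, open = "rt")
  on.exit(close(con))

  header_fields <- NA_integer_
  target_fields <- NA_integer_
  lines_seen <- 0

  repeat {
    # Read in chunks so a large file is never held in memory all at once
    chunk <- readLines(con, n = 10000L)
    if (length(chunk) == 0L) break

    # Record the field count of the #CHROM header line the first time it appears
    if (is.na(header_fields)) {
      h <- grep("^#CHROM", chunk)
      if (length(h) > 0L) {
        header_fields <- length(strsplit(chunk[h[1L]], "\t", fixed = TRUE)[[1L]])
      }
    }

    # Position of the requested line inside the current chunk, if it is here
    idx <- line_no - lines_seen
    if (idx >= 1 && idx <= length(chunk)) {
      target_fields <- length(strsplit(chunk[idx], "\t", fixed = TRUE)[[1L]])
      break
    }
    lines_seen <- lines_seen + length(chunk)
  }

  # NA means the header or the requested line was not found
  c(header = header_fields, line = target_fields)
}
```

For this file the call would be count_vcf_fields("Joint_allSNPjointMAF05.vcf.gz", 1615045). If the two counts differ, the line itself is short or truncated, even though bcftools queries on other regions succeed. Note that strsplit() drops a trailing empty field, so a line ending in a tab is reported with one field fewer.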