Error converting vcf to gds
annat22 • 0
@annat22-15313

Hello,

I keep getting an error when converting VCF to GDS with SNPRelate. I got the same error with two different files, although the conversion worked fine with other datasets previously. Pasted below are the code and error message, along with the session info. The error basically says there are fewer columns than expected. I re-ran everything up to this point to make sure the VCF is OK, and I can query it with bcftools with no issues, so it doesn't look like the VCF file is corrupted.

Would greatly appreciate any input.

Annat

> snpgdsVCF2GDS("Joint_allSNPjointMAF05.vcf.gz", "Joint_allSNPjointMAF05.gds", method="biallelic.only")
VCF Format --> SNP GDS Format
Method: exacting biallelic SNPs
Number of samples: 1894
Parsing "Joint_allSNPjointMAF05.vcf.gz" ...
Error:
FILE: Joint_allSNPjointMAF05.vcf.gz
    LINE: 1615045, COLUMN: 1605, 0/0:.:32,0:32:0:.:.:0,0,571
    fewer columns than what expected.
> traceback()
2: .Call(gnr_Parse_VCF4, vcf.fn[i], gfile$root, metidx, readLines,
       opfile, 1024L, ref.allele, ignore.chr.prefix, new.env(),
       verbose)
1: snpgdsVCF2GDS("Joint_allSNPjointMAF05.vcf.gz", "Joint_allSNPjointMAF05.gds",
       method = "biallelic.only")
>
> #snpgdsVCF2GDS("../../../../Raw/1000G/1000Genome_calls_all_chromosomes.vcf.gz","1000G_allSNPjointMAF05.gds", method="biallelic.only")
>
> #snpgdsCombineGeno(c("Joint_allSNPjointMAF05.gds", "1000G_allSNPjointMAF05.gds"),
>  #       "Joint1000G_allSNPjointMAF05.gds")
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] SNPRelate_1.0.1 gdsfmt_1.2.2

@stephanie-m-gogarten-5121
University of Washington

Your version of R is nearly 4 years out of date. Please upgrade to current R and Bioconductor package versions and try again.
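For reference, a minimal sketch of what the upgrade could look like once a current R is installed. It assumes an R release recent enough for the BiocManager package (R 3.5.0 or later); older releases used a different installer. The helper name `upgrade_snprelate` is hypothetical, and it is only defined here, not called, because calling it installs packages.

```r
# Show the R version currently in use (the question reports R 3.1.1).
cat(R.version.string, "\n")

# Hypothetical helper: install BiocManager if needed, then install or update
# the two packages listed in the sessionInfo() output above.
upgrade_snprelate <- function() {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(c("gdsfmt", "SNPRelate"))
}

# After running upgrade_snprelate(), the installed version can be checked with:
# packageVersion("SNPRelate")
```

Once the packages are updated, re-running the original `snpgdsVCF2GDS()` call shows whether the error persists.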

zhengx ▴ 30
@zhengx-7950
United States

You might try SeqArray::seqVCF2GDS(), which may give you more detailed error information.
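A rough sketch of that suggestion, together with a complementary check that uses only base R. Since the error reports fewer columns than expected, the hypothetical helper `find_short_vcf_lines()` below counts the tab-separated fields on each data line and compares them with the `#CHROM` header line. The output file name `Joint_allSNPjointMAF05.seq.gds` is an assumption for illustration, and the SeqArray call only runs if the package is installed and the input file exists.

```r
# Hypothetical helper: list the data lines of a (possibly gzipped) VCF whose
# number of tab-separated fields differs from the #CHROM header line.
find_short_vcf_lines <- function(vcf_path, chunk_size = 10000L) {
  con <- gzfile(vcf_path, open = "rt")  # gzfile() also reads plain text
  on.exit(close(con))
  expected <- NA_integer_
  line_no <- 0L
  bad <- data.frame(line = integer(0), fields = integer(0))
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    idx <- line_no + seq_along(lines)
    line_no <- line_no + length(lines)
    # Count tabs rather than splitting, so trailing empty fields are counted.
    tabs <- regmatches(lines, gregexpr("\t", lines, fixed = TRUE))
    n_fields <- vapply(tabs, length, integer(1)) + 1L
    hdr <- which(substr(lines, 1L, 6L) == "#CHROM")
    if (length(hdr) > 0L) expected <- n_fields[hdr[1L]]
    if (!is.na(expected)) {
      is_data <- substr(lines, 1L, 1L) != "#"
      hit <- is_data & n_fields != expected
      if (any(hit)) {
        bad <- rbind(bad, data.frame(line = idx[hit], fields = n_fields[hit]))
      }
    }
  }
  bad$expected <- rep(expected, nrow(bad))
  bad
}

vcf_fn <- "Joint_allSNPjointMAF05.vcf.gz"
if (file.exists(vcf_fn)) {
  # Lines with an unexpected field count, if any.
  print(find_short_vcf_lines(vcf_fn))
  if (requireNamespace("SeqArray", quietly = TRUE)) {
    # Output file name is an assumption, not taken from the question.
    SeqArray::seqVCF2GDS(vcf_fn, "Joint_allSNPjointMAF05.seq.gds")
  }
}
```

If the helper reports line 1615045 with fewer fields than the header, that would suggest the line itself is short in the file rather than a parsing quirk in SNPRelate.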

 
