Question: scanVcf: FORMAT 'GT' not found
7.0 years ago, seth redmond wrote:
I keep running into an error in my VCF files but can't seem to pinpoint where the problem is. The file has a number of missing genotypes but nothing that should be causing any problems, I don't think, and it passes vcf-validator without any problem. Completely unremarkable code and head of the file below.

Has anyone encountered this before? Or does anyone have suggestions as to what might be the issue?

thanks

-s

> filename <- "tmpvcf.vcf.gz"
> vcftab <- TabixFile(filename, index = paste(filename, "tbi", sep="."))
> vcfScan <- scanVcf(filename)
trace: scanVcf(filename)
trace: scanVcf(con)
Error: scanVcf: record 1 field 1 FORMAT 'GT' not found
  path: tmpvcf.vcf.gz

bash-3.2$ vcf-validator tmpvcf.vcf.gz
The header tag 'reference' not present. (Not required but highly recommended.)
The header tag 'contig' not present for CHROM=2R. (Not required but highly recommended.)
The header tag 'contig' not present for CHROM=3L. (Not required but highly recommended.)

##fileformat=VCFv4.1
##samtoolsVersion=0.1.18 (r982:295)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##FORMAT=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##source_20121102.1=./vcf-merge -s Fd03_high.vcf.gz Fd03_low.vcf.gz Fd03_zero.vcf.gz
##sourceFiles_20121102.1=0:Fd03_high.vcf.gz,1:Fd03_low.vcf.gz,2:Fd03_zero.vcf.gz
##INFO=<ID=SF,Number=.,Type=String,Description="Source File (index to sourceFiles, f when filtered)">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Fd03_high.vcf Fd03_low.vcf Fd03_zero.vcf
2R 23990061 . G A 152.33 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,2,4;DP=9;FQ=18.1;MQ=35;PV4=0.17,1,1,1;SF=0,1,2;VDB=0.0474 GT:DP4:GQ:DP:PL 0/1:3,0,2,4:48:9:121,0,45 0/1:1,3,6,5:90:15:212,0,87 0/1:2,3,7,5:99:17:214,0,103
2R 23990067 . G A 32.80 . AC1=1;AC=2;AF1=0.5;AN=4;DP4=4,1,2,3;DP=10;FQ=64.8;MQ=35;PV4=0.52,0.022,1,1;SF=0,1,2;VDB=0.0297 GT:DP4:GQ:DP:PL 0/1:4,1,2,3:95:10:92,0,106 .:6,8,2,1:.:17:20,.,. 0/1:8,8,1,4:59:21:56,0,255
2R 23990070 . T C 109.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,3,4;DP=11;FQ=10.4;MQ=35;PV4=0.2,0.091,1,1;SF=0,1,2;VDB=0.0474 GT:DP4:GQ:DP:PL 0/1:3,0,3,4:40:10:104,0,37 0/1:2,3,6,6:99:17:152,0,103 0/1:2,4,7,9:95:22:163,0,92
2R 23990073 . T C 100.33 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,3,4;DP=12;FQ=16.1;MQ=35;PV4=0.2,0.025,1,1;SF=0,1,2;VDB=0.0504 GT:DP4:GQ:DP:PL 0/1:3,0,3,4:46:10:101,0,43 0/1:2,3,6,5:99:16:134,0,103 0/1:2,4,7,9:99:22:156,0,113
2R 23990083 . T G 99.92 . AC1=1;AC=2;AF1=0.4995;AN=4;DP4=3,3,3,0;DP=10;FQ=3.02;MQ=38;PV4=0.46,5.9e-05,0.23,1;SF=0,1,2;VDB=0.0426 GT:GQ:DP4:DP:PL .:.:3,3,3,0:9:27,.,. 0/1:38:2,1,6,8:17:165,0,35 0/1:81:1,4,8,10:23:190,0,78
2R 23990100 . A C 114.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,2,3,1;DP=10;FQ=68;MQ=39;PV4=1,0.41,0.38,0.041;SF=0,1,2;VDB=0.0386 GT:DP4:GQ:DP:PL 0/1:4,2,3,1:98:10:95,0,141 0/1:4,5,3,6:99:18:167,0,172 0/1:4,6,3,6:99:19:172,0,185
2R 23990108 . T A 21.40 . AC1=1;AC=1;AF1=0.5;AN=2;DP4=5,2,3,2;DP=12;FQ=24;MQ=39;PV4=1,3.8e-05,1,1;SF=0,1,2;VDB=0.0075 GT:DP4:GQ:DP:PL 0/1:5,2,3,2:54:12:51,0,146 .:8,6,0,3:.:17:16,.,. .:5,10,1,2:.:18:1,.,.
2R 23990114 . C T 113.00 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=6,3,4,1;DP=14;FQ=81;MQ=40;PV4=1,1,0.24,1;SF=0,1,2;VDB=0.0523 GT:DP4:GQ:DP:PL 0/1:6,3,4,1:99:14:108,0,181 0/1:4,4,3,5:99:16:166,0,147 0/1:3,4,2,7:99:16:155,0,158
2R 23990116 . A T 20.25 . AC1=1;AC=1;AF1=0.4871;AN=2;DP4=8,3,2,1;DP=14;FQ=-14.2;MQ=40;PV4=1,6e-05,0.093,0.25;SF=0,1,2;VDB=0.0282 GT:GQ:DP4:DP:PL .:.:8,3,2,1:14:13,.,. 0/1:40:4,9,4,1:18:38,0,204 .:.:5,10,1,1:17:0,.,.
2R 23990120 . G C 189.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,2,6,3;DP=15;FQ=103;MQ=40;PV4=1,1,0.026,1;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,2,6,3:99:15:188,0,130 0/1:0,3,8,7:19:18:252,0,16 0/1:2,5,4,8:99:19:219,0,134
2R 23990143 . A C 190.67 . AC1=2;AC=6;AF1=1;AN=6;DP4=0,0,6,4;DP=11;FQ=-57;MQ=43;SF=0,1,2;VDB=0.0436 GT:DP4:GQ:DP:PL 1/1:0,0,6,4:57:10:248,30,0 1/1:0,0,3,6:51:9:212,27,0 1/1:0,0,2,7:51:9:211,27,0
2R 23990147 . A T 15.36 . AC1=1;AC=1;AF1=0.5;AN=2;DP4=5,6,2,1;DP=15;FQ=27;MQ=39;PV4=1,0.25,1,1;SF=0,1,2;VDB=0.0352 GT:DP4:GQ:DP:PL 0/1:5,6,2,1:57:14:54,0,230 .:7,5,0,2:.:14:15,.,. .:7,6,0,2:.:15:24,.,.
2R 23990163 . G A 38.03 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=2,2,2,3;DP=14;FQ=44;MQ=43;PV4=1,4e-05,0.44,0.19;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:2,2,2,3:74:9:71,0,106 0/1:0,1,4,1:20:6:66,0,17 0/1:0,2,4,1:51:7:67,0,48
2R 23990164 . T C 24.03 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,5,2,3;DP=14;FQ=22;MQ=41;PV4=1,0.00033,1,0.056;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,5,2,3:52:14:49,0,164 0/1:3,2,4,1:56:10:53,0,77 0/1:1,4,4,1:63:10:60,0,96
2R 23990171 . T C 74.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,5,3,4;DP=16;FQ=71;MQ=41;PV4=1,6.1e-07,0.1,1;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,5,3,4:99:16:98,0,194 0/1:4,2,6,1:99:13:100,0,131 0/1:5,3,3,4:99:15:116,0,173
2R 23990190 . C A 27.34 . AC1=1;AC=1;AF1=0.4997;AN=2;DP4=4,6,2,2;DP=14;FQ=4.77;MQ=43;PV4=1,2.3e-09,1,0.15;SF=0,1,2;VDB=0.0352 GT:DP4:GQ:DP:PL 0/1:4,6,2,2:28:14:30,0,225 .:8,1,0,1:.:10:0,.,. .:12,5,2,0:.:19:0,.,.
2R 23990198 . G T 26.67 . AC1=0;AC=1;AF1=0;AN=2;DP4=6,7,2,0;DP=15;FQ=-28;MQ=44;PV4=0.47,0.0016,1,0.052;SF=0,1,2;VDB=0.0260 GT:GQ:DP4:DP:PL .:.:6,7,2,0:15:0,.,. .:.:6,1,1,0:8:3,.,. 0/1:55:10,2,5,1:18:52,0,200
Answer: scanVcf: FORMAT 'GT' not found
7.0 years ago, Valerie Obenchain (United States) wrote:
Hi Seth,

What version of VariantAnnotation are you using? Please provide the output of sessionInfo().

I think there is a spacing problem in the file: are there true tabs between each field? Test using just the first line of the file so you can easily see and modify the tabs.

I can't reproduce your error with the file output you posted, but I may be modifying the format as I cut and paste. If looking at the spacing does not solve the problem, please attach a small subset of the file, maybe just the first 5 rows.

Valerie
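The tab check suggested in this answer can also be done programmatically rather than by eye. The following is only a minimal sketch, written in Python for convenience rather than in R; the function names `describe_line` and `scan_layout` are hypothetical helpers invented for this illustration and are not part of VariantAnnotation or any VCF tool. It assumes the file is gzip-compatible (as a bgzipped `.vcf.gz` is) and simply reports, for the `#CHROM` line and the first few records, how many tab-separated fields there are and whether anything looks off.

```python
import gzip


def describe_line(line):
    """Summarise the whitespace layout of one VCF line (header or record)."""
    text = line.rstrip("\r\n")
    fields = text.split("\t")
    return {
        "n_fields": len(fields),
        # A tab at the very end of the line produces an empty last field.
        "trailing_tab": text.endswith("\t"),
        "empty_fields": [i for i, f in enumerate(fields) if f == ""],
        # Spaces inside a field hint that tabs were replaced by spaces.
        "fields_with_spaces": [i for i, f in enumerate(fields) if " " in f],
    }


def scan_layout(path, max_records=5):
    """Describe the #CHROM line and the first few records of a gzipped VCF.

    Meta-information lines (starting with '##') are skipped because their
    Description strings legitimately contain spaces.
    """
    reports = []
    seen = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                continue
            reports.append(describe_line(line))
            if not line.startswith("#"):
                seen += 1
                if seen >= max_records:
                    break
    return reports


# Illustrative in-memory lines (not the poster's real file): the header
# carries a trailing tab, the record does not.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tFd03_high.vcf\t"
record = "2R\t23990061\t.\tG\tA\t152.33\t.\tDP=9\tGT:DP\t0/1:9"
print(describe_line(header))
print(describe_line(record))
```

If the `#CHROM` line reports more fields than the records do, or a non-empty `empty_fields` list, that mismatch is a reasonable first thing to investigate.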
7.0 years ago, seth redmond replied:

Urgh, yeah, I'd checked the tabs between the columns a hundred times, but I hadn't checked for trailing tabs in the header. Thanks for the nudge.

-s
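For anyone hitting the same error, the trailing tabs in the header that this reply describes can be removed with a few lines of code. What follows is a hedged sketch in Python, not the poster's actual fix (the thread does not say how the file was repaired); `strip_trailing_tabs` and `clean_vcf` are hypothetical names made up for this example, and only lines starting with `#` are touched so that data records are left exactly as they were.

```python
import gzip


def strip_trailing_tabs(lines):
    """Yield lines with trailing tabs removed from header lines only.

    Header lines are those starting with '#'; the original line ending
    of every line is preserved.
    """
    for line in lines:
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        if body.startswith("#"):
            body = body.rstrip("\t")
        yield body + ending


def clean_vcf(src, dst):
    """Copy a gzipped VCF from src to dst, stripping header trailing tabs.

    Assumption: dst is written as plain gzip, not BGZF, so it would still
    need recompressing before it can be tabix-indexed.
    """
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        fout.writelines(strip_trailing_tabs(fin))


# Illustrative lines only: the #CHROM line ends in two stray tabs.
raw = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tFd03_high.vcf\t\t\n",
    "2R\t23990061\t.\tG\tA\t152.33\t.\tDP=9\tGT:DP\t0/1:9\n",
]
for cleaned in strip_trailing_tabs(raw):
    print(repr(cleaned))
```

Because the output here is ordinary gzip, the cleaned file would presumably need to be recompressed with bgzip and reindexed with tabix before `TabixFile` can use it again.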