Dear all,
I would appreciate your suggestions on the proper use of VRanges. I have a VCF file with somatic variants, generated by MUTECT2; a few lines are shown below. I would like to keep only the variants that have AD > 5 and AF > 0.05 in the TUMOR sample.
The code below works:
svp <- ScanVcfParam(geno=c("AD","AF"), samples="TUMOR")
vcf2 <- readVcfAsVRanges("AML_expanded.vcf", "hg38", svp)
However, why does the code below not work properly when I import the fields for both NORMAL and TUMOR? Thank you very much!
svp <- ScanVcfParam(info="DP", geno=c("AD","AF"), samples=c("NORMAL","TUMOR"))
A few lines from the VCF file:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR
chr1 108044 chr1:108044_C/G C G . PASS BaseCounts=0,45,5,0;ECNT=1;FS=8.486;GC=39.6;HCNT=1;HRun=0;LowMQ=0,0.06,50;NLOD=2.41;SOR=1.874;TLOD=7.89;VariantType=SNP GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:8,0:0:0:0:.:238:6:2 0/1:38,5:0.116:3:2:0.4:1113:19:19
chr1 123100 chr1:123100_A/ATG A ATG . PASS BaseCounts=65,0,3,0;ECNT=1;GC=27.72;HCNT=1;HRun=0;LowMQ=0,0.0441,68;NLOD=5.06;RPA=5,6;RU=TG;STR;TLOD=11.79;VariantType=INSERTION.NumRepetitions_5.EventLength_2.RepeatExpansion_TG GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:32,4:0.058:0:0:903:17:13 0/1:31,13:0.173:2:1:885:13:18
chr1 187017 chr1:187017_G/C G C . PASS BaseCounts=0,10,161,0;ECNT=1;FS=4.83;GC=29.7;HCNT=1;HRun=0;LowMQ=0,0.0819,171;NLOD=28.89;SOR=1.59;TLOD=7.71;VariantType=SNP GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:101,0:0:0:0:.:3012:50:51 0/1:133,7:0.05:3:4:0.429:3830:59:74
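For reference, here is a minimal sketch of the filter I have in mind, written against a toy VRanges built by hand from the three records above rather than from the real file. It assumes that "AD > 5" means the alternate-allele depth (the second AD value), that AF ends up as a plain numeric metadata column, and that a two-sample import holds one row per variant per sample, so the sample has to be selected explicitly. The object names (vr, keep, tumor_pass) are only illustrative.

```r
library(VariantAnnotation)

# Toy data: the three example records, one row per variant per sample
# (NORMAL rows first, then TUMOR rows). Values copied from the AD and AF fields.
vr <- VRanges(
  seqnames    = rep("chr1", 6),
  ranges      = IRanges(start = rep(c(108044, 123100, 187017), 2), width = 1),
  ref         = rep(c("C", "A", "G"), 2),
  alt         = rep(c("G", "ATG", "C"), 2),
  refDepth    = c(8L, 32L, 101L, 38L, 31L, 133L),
  altDepth    = c(0L, 4L, 0L, 5L, 13L, 7L),
  sampleNames = rep(c("NORMAL", "TUMOR"), each = 3),
  AF          = c(0, 0.058, 0, 0.116, 0.173, 0.05)
)

# With the real file, vr would instead come from something like:
# vr <- readVcfAsVRanges("AML_expanded.vcf", "hg38", svp)

# Keep TUMOR rows with alternate depth > 5 and allele fraction > 0.05.
# as.vector() guards against the accessors returning Rle objects.
keep <- as.vector(sampleNames(vr) == "TUMOR") &
  as.vector(altDepth(vr)) > 5 &
  vr$AF > 0.05

tumor_pass <- vr[keep]
tumor_pass
```

With strict inequalities, the first record (alternate depth 5) and the third (AF 0.05) sit exactly on the thresholds and are dropped; using >= instead would keep them.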
What does 'does not work properly' mean?
Hi Martin, thank you for your question: I have re-run the R code and, except for fewer than 10 variants in the VCF file, everything else looks fine, and the variants are printed in the output file. Thanks a lot, and happy weekend!