I have a list of variants (in VCF format) called across two samples. I want to use multiple criteria to select a subset of these variants.
For example, I want to select only "G" to "A" or "C" to "T" changes, since I'm only interested in these two specific types of SNPs.
I also want these SNPs to have certain GT call combinations in the two samples: either "0/1" for sample 1 and "1/1" for sample 2, or (as an alternative rule) any combination except "1/1" in both samples at the same time.
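To make the criteria concrete, here is a rough sketch of the logic in plain Python, working on raw VCF text lines rather than on the VariantAnnotation object shown below. The function names and the example records are made up for illustration, and it assumes biallelic sites and unphased GT strings such as "0/1".

```python
# Sketch only: hypothetical helper names, invented example records.
# Assumes tab-separated VCF data lines, biallelic sites, unphased genotypes.

WANTED_CHANGES = {("G", "A"), ("C", "T")}  # (REF, ALT) pairs of interest


def parse_record(line):
    """Return (ref, alt, [GT per sample]) from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    gt_index = fields[8].split(":").index("GT")  # position of GT in FORMAT
    gts = [sample.split(":")[gt_index] for sample in fields[9:]]
    return ref, alt, gts


def is_wanted_snp(ref, alt):
    # Multi-allelic ALT values such as "A,T" never match and are dropped.
    return (ref, alt) in WANTED_CHANGES


def het_then_hom(gts):
    """Rule 1: sample 1 is 0/1 and sample 2 is 1/1."""
    return gts[0] == "0/1" and gts[1] == "1/1"


def not_both_hom_alt(gts):
    """Rule 2: anything except 1/1 in both samples."""
    return not (gts[0] == "1/1" and gts[1] == "1/1")


def select(lines, genotype_rule):
    """Keep data lines that pass the SNP-type test AND the genotype rule."""
    kept = []
    for line in lines:
        if line.startswith("#"):  # skip header lines
            continue
        ref, alt, gts = parse_record(line)
        if is_wanted_snp(ref, alt) and genotype_rule(gts):
            kept.append(line)
    return kept


# Invented records, two samples (s1, s2), for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "chr1\t100\t.\tG\tA\t50\t.\t.\tGT:PL\t0/1:30,0,40\t1/1:60,10,0",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:PL\t1/1:60,10,0\t1/1:60,10,0",
    "chr1\t300\t.\tA\tG\t50\t.\t.\tGT:PL\t0/1:30,0,40\t1/1:60,10,0",
    "chr1\t400\t.\tC\tT\t50\t.\t.\tGT:PL\t0/1:30,0,40\t0/1:30,0,40",
]

if __name__ == "__main__":
    for line in select(example, het_then_hom):
        print(line)
```

With these invented records, rule 1 keeps only the record at position 100, while rule 2 keeps the records at positions 100 and 400. What I am after is the equivalent combination of a REF/ALT test and a per-sample GT test.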
What's the best way to achieve this? I had difficulty combining these criteria.
class: CollapsedVCF
dim: 309482 2
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DataFrame with 17 columns: INDEL, IDV, IMF, DP, VDB, RPB, MQB, BQB, MQSB, SGB, MQ0F, ICB, HOB, AC, AN, DP4, MQ
info(header(vcf)):
         Number Type    Description
   INDEL 0      Flag    Indicates that the variant is an INDEL.
   IDV   1      Integer Maximum number of reads supporting an indel
   IMF   1      Float   Maximum fraction of reads supporting an indel
   DP    1      Integer Raw read depth
   VDB   1      Float   Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better),Version
   RPB   1      Float   Mann-Whitney U test of Read Position Bias (bigger is better)
   MQB   1      Float   Mann-Whitney U test of Mapping Quality Bias (bigger is better)
   BQB   1      Float   Mann-Whitney U test of Base Quality Bias (bigger is better)
   MQSB  1      Float   Mann-Whitney U test of Mapping Quality vs Strand Bias (bigger is better)
   SGB   1      Float   Segregation based metric.
   MQ0F  1      Float   Fraction of MQ0 reads (smaller is better)
   ICB   1      Float   Inbreeding Coefficient Binomial test (bigger is better)
   HOB   1      Float   Bias in the number of HOMs number (smaller is better)
   AC    A      Integer Allele count in genotypes for each ALT allele, in the same order as listed
   AN    1      Integer Total number of alleles in called genotypes
   DP4   4      Integer Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases
   MQ    1      Integer Average mapping quality
geno(vcf):
  SimpleList of length 2: GT, PL
geno(header(vcf)):
      Number Type    Description
   GT 1      String  Genotype
   PL G      Integer List of Phred-scaled genotype likelihoods