After running example(readVcf), I have an object with two dimensions (5 variants and 3 samples):
> vcf
class: CollapsedVCF
dim: 5 3
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DFrame with 1 column: AF
info(header(vcf)):
      Number Type  Description
   AF A      Float Allele Frequency
geno(vcf):
  SimpleList of length 1: HQ
geno(header(vcf)):
      Number Type    Description
   HQ 2      Integer Haplotype Quality
As the display suggests, the FILTER field is stored in the metadata columns of rowRanges(vcf), and can then be extracted with, e.g., $FILTER:
> rowRanges(vcf)
GRanges object with 5 ranges and 5 metadata columns:
                 seqnames          ranges strand | paramRangeID            REF
                    <Rle>       <IRanges>  <Rle> |     <factor> <DNAStringSet>
       rs6054257       20           14370      * |        geneA              G
    20:17330_T/A       20           17330      * |        geneA              T
       rs6040355       20         1110696      * |        geneB              A
  20:1230237_T/.       20         1230237      * |        geneB              T
       microsat1       20 1234567-1234569      * |        geneB            GTC
                                ALT      QUAL      FILTER
                 <DNAStringSetList> <numeric> <character>
       rs6054257                  A        29        PASS
    20:17330_T/A                  A         3         q10
       rs6040355                G,T        67        PASS
  20:1230237_T/.                           47        PASS
       microsat1             G,GTCT        50        PASS
  -------
  seqinfo: 1 sequence from hg19 genome
> rowRanges(vcf)$FILTER
[1] "PASS" "q10" "PASS" "PASS" "PASS"
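As a minimal sketch of other ways to reach the same values: VariantAnnotation also provides the fixed() and filt() accessors, which should return the same per-variant FILTER strings as rowRanges(vcf)$FILTER. The file ex2.vcf used here is an assumption; it is the small example file shipped with the package, read in full rather than with whatever parameters example(readVcf) ends on.

```r
library(VariantAnnotation)

# Assumption: ex2.vcf (shipped with VariantAnnotation) stands in for the
# object left behind by example(readVcf).
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Three routes to the per-variant FILTER values
via_ranges <- rowRanges(vcf)$FILTER   # metadata column of the row ranges
via_fixed  <- fixed(vcf)$FILTER       # column of the 'fixed' DataFrame
via_filt   <- filt(vcf)               # dedicated accessor

# Tabulate how many variants carry each FILTER value
table(via_ranges)
```

Tabulating the values first is a quick way to see which FILTER strings are present before deciding what to keep.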
We'd like to keep the features (variants) with rowRanges(vcf)$FILTER == "PASS" and all samples, so we expect 4 rows and 3 columns:
> vcf[ rowRanges(vcf)$FILTER == "PASS",]
class: CollapsedVCF
dim: 4 3
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DFrame with 1 column: AF
info(header(vcf)):
      Number Type  Description
   AF A      Float Allele Frequency
geno(vcf):
  SimpleList of length 1: HQ
geno(header(vcf)):
      Number Type    Description
   HQ 2      Integer Haplotype Quality
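The row/column subsetting above can be sketched step by step as follows. This assumes the ex2.vcf example file shipped with VariantAnnotation as a stand-in for the object shown; the names keep, vcf_pass, ok and vcf_ok are illustrative only, and the set of accepted FILTER values in ok is hypothetical.

```r
library(VariantAnnotation)

# Assumption: ex2.vcf (shipped with VariantAnnotation) stands in for the
# object left behind by example(readVcf).
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# One TRUE/FALSE per variant (row)
keep <- rowRanges(vcf)$FILTER == "PASS"

# Rows selected by 'keep'; the empty second index retains all samples
vcf_pass <- vcf[keep, ]

dim(vcf)       # before filtering
dim(vcf_pass)  # after filtering

# Accepting several FILTER values at once (hypothetical set of values)
ok <- c("PASS", "q10")
vcf_ok <- vcf[rowRanges(vcf)$FILTER %in% ok, ]
```

Storing the logical vector separately makes it easy to check how many variants will be dropped (for example with table(keep)) before subsetting.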
Is that what you were looking for?
subset() is a convenience for filtering on the elements that occur exactly once on each row (such as FILTER), so the same result can be written as:
> subset(vcf, FILTER == "PASS")
class: CollapsedVCF
dim: 4 3
rowRanges(vcf):
  GRanges with 5 metadata columns: paramRangeID, REF, ALT, QUAL, FILTER
info(vcf):
  DFrame with 1 column: AF
info(header(vcf)):
      Number Type  Description
   AF A      Float Allele Frequency
geno(vcf):
  SimpleList of length 1: HQ
geno(header(vcf)):
      Number Type    Description
   HQ 2      Integer Haplotype Quality
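To check that the two spellings agree, a small sketch comparing bracket subsetting with subset() might look like this. It again assumes the ex2.vcf example file from VariantAnnotation; by_index and by_subset are illustrative names.

```r
library(VariantAnnotation)

# Assumption: ex2.vcf (shipped with VariantAnnotation) stands in for the
# object left behind by example(readVcf).
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")

# Explicit logical index on the rows
by_index <- vcf[rowRanges(vcf)$FILTER == "PASS", ]

# Same condition, with FILTER looked up by subset() itself
by_subset <- subset(vcf, FILTER == "PASS")

# Both routes should select the same variants for the same samples
identical(rownames(by_index), rownames(by_subset))
identical(colnames(by_index), colnames(by_subset))
```

The subset() form is shorter for interactive use, while the bracket form makes it explicit where FILTER comes from.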
Thanks, that makes it perfectly clear.