#### Posts by Daniel Cameron

... VF, BANRP, and BANSR are all GRIDSS-specific fields. Unfortunately, there are no VCF-defined equivalents of the AD and DP fields suitable for structural variants, so each caller uses its own custom fields (or doesn't report at all). From a specifications perspective, QUAL is the only suita ...
written 27 days ago by Daniel Cameron20
... When loading a VCF file, one of the records is ignored. The missing record has a 4094 byte record immediately prior to it and any change to the length of that record makes the loading succeed again. Changing the content of the prior record without changing the length also causes the record to fail t ...
written 11 months ago by Daniel Cameron20
... I'm looking for the closest upstream TSS for viral insertion sites. This means my query strand is *, but I do care about transcript orientation (subjects are coordinate and strand of TSS). In the case of overlapping genes, distanceToNearest() is not returning the one with the closest TSS. ...
written 23 months ago by Daniel Cameron20
... But the closest subject ranges are at positions 1 and 2, not 2 and 3 as can be seen by calling nearest(): > nearest(GRanges(seqnames="chr1", ranges=IRanges(start=1100, width=1), strand="*"), subject, select="all") Hits object with 2 hits and 0 metadata columns:       queryHits subjectHits       ...
written 23 months ago by Daniel Cameron20
... I am attempting to find the distance to the closest upstream but distanceToNearest does not appear to return the closest. A simple reproduction is as follows:   > subject <- GRanges(seqnames="chr1", ranges=IRanges(start=c(1000, 1500, 2000, 2500, 3000, 3500), width=1),strand=c("+", "-", "+" ...
written 23 months ago by Daniel Cameron20 • updated 23 months ago by Valerie Obenchain6.7k
... Thanks very much for your informative response. Are you open to patch submission on the VariantAnnotation package? In this particular case, make.unique() on just the variants generated by VariantAnnotation would give uniqueness without breaking the existing naming convention. I've done this in my o ...
written 3.6 years ago by Daniel Cameron20
... >If you're getting an error that says VCF data must have unique rownames please let me know and show a small example. Here is a small example showing the problem: writeVcf(.testrecord(c("chr1\t100\t.\tA\t<DEL>\t.\t.\tSVLEN=-1", "chr1\t100\t.\tA\t<DEL>\t.\t.\tSVLEN=-100")), "test.vc ...
written 3.6 years ago by Daniel Cameron20
... https://samtools.github.io/hts-specs/VCFv4.2.pdf 1.4.1.3 ID - identifier: Semi-colon separated list of unique identifiers where available. If this is a dbSNP variant it is encouraged to use the rs number(s). No identifier should be present in more than one data record. If there is no identifier ava ...
written 3.6 years ago by Daniel Cameron20
... The issue is that VariantAnnotation automatically creates names for the VCF rows that were not assigned any name. If these names are used to write a VCF file, the resultant VCF will not be valid, because uniqueness is required by the VCF specifications. In my code, I'm using the VCF ID as a uni ...
written 3.6 years ago by Daniel Cameron20
... VCF requires unique row names, but the formula used to generate placeholder row names can produce duplicates (such as for pindel with default parameters run on the NA12878 Illumina platinum genomic WGS data). Is this intentional, or just an oversight due to the lack of structural variant data out th ...
written 3.6 years ago by Daniel Cameron20 • updated 3.6 years ago by Valerie Obenchain6.7k

