writeVcf no longer writes records for ungenotyped data
1
2
Entering edit mode
@daniel-cameron-6227
Last seen 8.7 years ago
Australia

VariantAnnotation_1.8.13 was writing records correctly, but VariantAnnotation_1.12.3 no longer writes ungenotyped data to disk. The VCF headers are successfully written but variants are not. The source code indicates the problem is that VariantAnnotation:::.makeVcfGeno does not write the records to disk as they are flushed by the C .make_vcf_geno function which is only called if genotypes exist.

As a workaround, I am using writeLines(c(VariantAnnotation:::.makeVcfHeader(vcf), VariantAnnotation:::.makeVcfMatrix(NULL, vcf)), file) instead of writeVcf(vcf, file) but a proper fix would be very much appreciated.

 


 

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] VariantAnnotation_1.12.3 Rsamtools_1.18.1         Biostrings_2.34.0
[4] XVector_0.6.0            GenomicRanges_1.18.1     GenomeInfoDb_1.2.2
[7] IRanges_2.0.0            S4Vectors_0.4.0          BiocGenerics_0.12.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1    BBmisc_1.8              BSgenome_1.34.0
 [4] BatchJobs_1.5           Biobase_2.26.0          BiocParallel_1.0.0
 [7] DBI_0.3.1               GenomicAlignments_1.2.1 GenomicFeatures_1.18.2
[10] RCurl_1.95-4.3          RSQLite_1.0.0           XML_3.98-1.1
[13] base64enc_0.1-2         biomaRt_2.22.0          bitops_1.0-6
[16] brew_1.0-6              checkmate_1.5.0         codetools_0.2-9
[19] digest_0.6.4            fail_1.2                foreach_1.4.2
[22] iterators_1.0.7         rtracklayer_1.26.1      sendmailR_1.2-1
[25] stringr_0.6.2           tools_3.1.2             zlibbioc_1.12.0

 

variantannotation • 1.4k views
ADD COMMENT
2
Entering edit mode
@valerie-obenchain-4275
Last seen 2.9 years ago
United States

Thanks Daniel, I'll look into it.

By ungenotyped data do you mean GT is './.' or there is no GT field at all? To be sure I address your specific case, can you provide a small sample file off-line?

Valerie

ADD COMMENT
1
Entering edit mode

No GT field at all. The following script reproduces the problem:

library(VariantAnnotation)
f <- file("test.vcf")
writeLines(c(
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
  "chr1\t1\tvariant1\tA\t[chr1:1[a\t1\t.\tATTR=value",
  "chr2\t2\tvariant2\tA\t[chr2:2[a\t2\t.\tATTR=value"
  ), f)
close(f)
vcf <- readVcf("test.vcf", "hg19")
writeVcf(vcf, "test-unchanged-expect-2-records.vcf")
writeVcf(vcf[fixed(vcf)$QUAL==1,], "test-subset-expect-1-record.vcf")

ADD REPLY
1
Entering edit mode

Thanks for the example. Now fixed in release (1.12.4) and devel (1.13.11).

The bug was a hold over from when data were written out from R. As you've seen, we've since moved the writing down to C. Thanks for reporting the bug.

Valerie

 

ADD REPLY
0
Entering edit mode

I am using "VariantAnnotation_1.12.9", I am reading in VCF using readVcf, then writing out using writeVcf, and no-call genotypes get converted to ref, ie.: "./." becomes "0/0", is this normal?

ADD REPLY
1
Entering edit mode

I don't see that behavior in release (1.12.9) or devel (1.13.25). No-call genotypes are read in as a single dot '.' and written out the same; they are not replaced by '0/0'.

In this test file all GT are './.':

fl <- system.file("unitTests", "cases", "no_INFO_header.vcf", 
                  package="VariantAnnotation")
vcf <- readVcf(fl, "")
> geno(vcf)$GT
              587338
1:2020003_G/A "."   
1:2158344_C/T "."   
1:2179799_C/T "."   
1:2229439_C/G "."   
1:2229827_G/A "."

This has always been the behavior in read/writeVcf(). If there is a need/request, I can look into reading and writing as './.' instead of the single dot.

If you have a reproducible example of where './.' is written as '0/0' please post it or send to me off line.

Valerie

ADD REPLY

Login before adding your answer.

Traffic: 929 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6