Does readVcf() mistakenly ignore ploidy for missing genotypes?
@timotheeflutre-6727
France

Minimal reproducible example:

library(VariantAnnotation)
con <- url("https://raw.githubusercontent.com/timflutre/rutilstimflutre/master/inst/extdata/example.vcf")
vcf.txt <- readLines(con)
close(con)
vcf.file <- "example.vcf"
writeLines(vcf.txt, vcf.file)
vcf <- readVcf(vcf.file)
geno(vcf)$GT

which returns:

       ind1  ind2  ind3
snp1   "0/0" "0/1" "1/1"
snp2   "0/1" "."   "."
indel1 "0/0" "0/1" "1/1"

However, "." should be "./.", as in the input file and in the VCF format specification. Or am I missing something?

PS: here is my sessionInfo():

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] VariantAnnotation_1.20.2   Rsamtools_1.26.1          
 [3] Biostrings_2.42.0          XVector_0.14.0            
 [5] SummarizedExperiment_1.4.0 Biobase_2.34.0            
 [7] GenomicRanges_1.26.1       GenomeInfoDb_1.10.1       
 [9] IRanges_2.8.1              S4Vectors_0.12.0          
[11] BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8              AnnotationDbi_1.36.0     GenomicAlignments_1.10.0
 [4] zlibbioc_1.20.0          BiocParallel_1.8.1       BSgenome_1.42.0         
 [7] lattice_0.20-34          tools_3.3.2              grid_3.3.2              
[10] DBI_0.5-1                digest_0.6.10            Matrix_1.2-8            
[13] rtracklayer_1.34.1       bitops_1.0-6             biomaRt_2.30.0          
[16] RCurl_1.95-4.8           memoise_1.0.0            RSQLite_1.1-2           
[19] compiler_3.3.2           GenomicFeatures_1.26.0   XML_3.98-1.5            


Hi,

Yes, readVcf() currently ignores ploidy: every missing genotype is represented by a single "." regardless of how many alleles the input file lists. This is a good suggestion and we'll make the change. Several other items are ahead of it on the TODO list, but we'll get to it as soon as we can.
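
Until that change lands, one possible workaround is to post-process the GT matrix after reading. The sketch below is only an illustration: expand_missing_gt() is a hypothetical helper, not part of VariantAnnotation, and it assumes every sample has the same ploidy (diploid by default) and unphased "/" separators. The small matrix stands in for geno(vcf)$GT and copies the output shown in the question.

```r
# Hypothetical helper (not part of VariantAnnotation): expand a collapsed
# missing genotype "." into one "." per allele for an assumed ploidy.
expand_missing_gt <- function(gt, ploidy = 2L, sep = "/") {
  # Build the full missing genotype, e.g. "./." for ploidy 2.
  missing_gt <- paste(rep(".", ploidy), collapse = sep)
  # Replace only entries that are exactly "."; other calls are untouched.
  gt[!is.na(gt) & gt == "."] <- missing_gt
  gt
}

# Stand-in for geno(vcf)$GT, copied from the output in the question
# (matrix() fills column by column, so each row below is one sample).
gt <- matrix(c("0/0", "0/1", "0/0",   # ind1
               "0/1", ".",   "0/1",   # ind2
               "1/1", ".",   "1/1"),  # ind3
             nrow = 3,
             dimnames = list(c("snp1", "snp2", "indel1"),
                             c("ind1", "ind2", "ind3")))

gt_fixed <- expand_missing_gt(gt)
print(gt_fixed)
```

With a real VCF object, the same helper could be applied as geno(vcf)$GT <- expand_missing_gt(geno(vcf)$GT). Note that this cannot recover the original ploidy from the file: if samples differ in ploidy, the assumed value will be wrong for some of them.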

Valerie


That would be great, thanks!
