Question: does readVcf mistakenly ignore ploidy for missing genotypes?
Asked 8 months ago by TimothéeFlutre (France):

Minimal reproducible example:

library(VariantAnnotation)
# Download the example VCF and save a local copy
con <- url("https://raw.githubusercontent.com/timflutre/rutilstimflutre/master/inst/extdata/example.vcf")
vcf.txt <- readLines(con)
close(con)
vcf.file <- "example.vcf"
writeLines(vcf.txt, vcf.file)
# Read the file and display the genotype (GT) matrix
vcf <- readVcf(vcf.file)
geno(vcf)$GT

which returns:

       ind1  ind2  ind3
snp1   "0/0" "0/1" "1/1"
snp2   "0/1" "."   "."
indel1 "0/0" "0/1" "1/1"

However, "." should be "./.", as in the input file and in the VCF format specification. Or am I missing something?

PS: here is my sessionInfo():

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] VariantAnnotation_1.20.2   Rsamtools_1.26.1          
 [3] Biostrings_2.42.0          XVector_0.14.0            
 [5] SummarizedExperiment_1.4.0 Biobase_2.34.0            
 [7] GenomicRanges_1.26.1       GenomeInfoDb_1.10.1       
 [9] IRanges_2.8.1              S4Vectors_0.12.0          
[11] BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8              AnnotationDbi_1.36.0     GenomicAlignments_1.10.0
 [4] zlibbioc_1.20.0          BiocParallel_1.8.1       BSgenome_1.42.0         
 [7] lattice_0.20-34          tools_3.3.2              grid_3.3.2              
[10] DBI_0.5-1                digest_0.6.10            Matrix_1.2-8            
[13] rtracklayer_1.34.1       bitops_1.0-6             biomaRt_2.30.0          
[16] RCurl_1.95-4.8           memoise_1.0.0            RSQLite_1.1-2           
[19] compiler_3.3.2           GenomicFeatures_1.26.0   XML_3.98-1.5            


Hi,

Yes, readVcf() currently ignores ploidy for missing genotypes: every missing value is represented by a single '.'. This is a good suggestion and we'll make the change. Several items are ahead of it on the TODO list, but we'll get to it as soon as we can.

Valerie

Reply written 8 months ago by Valerie Obenchain

That would be great, thanks!

Reply written 8 months ago by TimothéeFlutre
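
For versions of readVcf() that still return a single "." for missing genotypes, one possible workaround is to post-process the GT matrix. The sketch below is only an illustration: fix_missing_gt() is a hypothetical helper, not part of VariantAnnotation, and it assumes that all samples share the same ploidy (diploid by default) and that genotypes are unphased ("/" separator). The matrix is rebuilt by hand from the output shown in the question, so the example runs in base R.

```r
# Hypothetical helper (not part of VariantAnnotation): expand "." into a
# ploidy-aware missing genotype such as "./.".
# Assumptions: same ploidy for every sample, "/" as allele separator.
fix_missing_gt <- function(gt, ploidy = 2L, sep = "/") {
  stopifnot(is.matrix(gt), is.character(gt), ploidy >= 1L)
  # Build the missing-genotype string, e.g. "./." when ploidy = 2
  missing_gt <- paste(rep(".", ploidy), collapse = sep)
  # Replace only the entries that are exactly "."; dimnames are preserved
  gt[!is.na(gt) & gt == "."] <- missing_gt
  gt
}

# Same content as the geno(vcf)$GT matrix shown in the question
gt <- matrix(c("0/0", "0/1", "1/1",
               "0/1", ".",   ".",
               "0/0", "0/1", "1/1"),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("snp1", "snp2", "indel1"),
                             c("ind1", "ind2", "ind3")))

gt_fixed <- fix_missing_gt(gt)
print(gt_fixed)
```

With a VCF object in hand, the same helper could presumably be applied as fix_missing_gt(geno(vcf)$GT). If ploidy differs between samples, or if phased genotypes ("|") are present, the replacement string would have to be chosen per sample rather than globally.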