Question: writeVcf bug
0
gravatar for Richard Pearson
6.3 years ago by
Richard Pearson60 wrote:
Hi I've noticed a bug with writeVcf when trying to write out relatively simple vcf files, e.g. a file containing only GT in the FORMAT fields. The following is a (hopefully) reproducible example based on examples in ?writeVcf library(VariantAnnotation) fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") out1.vcf <- tempfile() out2.vcf <- tempfile() in1 <- readVcf(fl, "hg19") writeVcf(in1, out1.vcf) in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) writeVcf(in2, out2.vcf) This last line gives me: Error in row(genoMat) : a matrix-like object is required as argument to 'row' Looking at the following code snippet from .makeVcfGeno, it seems genoMat is first getting created as a matrix, but then changed to something that is not a matrix by "genoMat <- genoMatFlat". The error then comes from the "row(genoMat)" on the last line shown. I don't really understand what is going on in this code so can't offer a patch I'm afraid. genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, recursive = FALSE), nsub * nrec, length(geno)) genoMatFlat <- as.character(unlist(genoMat)) genoMatFlat[is.na(genoMatFlat)] <- "." if (is.list(genoMat)) { genoMatList <- relist(genoMatFlat, PartitioningByEnd(genoMat)) genoMatFlat <- .pasteCollapse(genoMatList, ",") genoMat <- matrix(genoMatFlat, nrow(genoMat), ncol(genoMat)) } else genoMat <- genoMatFlat formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * nrec, length(geno), byrow = TRUE) keep <- !is.na(formatMatPerSub) genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) FWIW, as a temporary fix I made an extra, slightly more complex FORMAT field using the following: geno(in2)[["DUMMY"]] <- matrix(sapply(as.vector(geno(in2)[["GT"]]), function(x) list(c(x, x))), ncol=ncol(geno(in2)[["GT"]]), dimnames=dimnames(geno(in2)[["GT"]])) writeVcf(in2, out2.vcf) I've checked and it seems this bug is also present in latest devel. Any chance of a fix? Thanks Richard > sessionInfo() R version 2.15.1 (2012-06-22) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 devtools_0.8 BiocInstaller_1.8.3 loaded via a namespace (and not attached): [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 stats4_2.15.1 stringr_0.6.1 tools_2.15.1 whisker_0.1 [21] XML_3.95-0.1 zlibbioc_1.4.0
• 638 views
ADD COMMENTlink modified 6.3 years ago by Michael Lawrence10k • written 6.3 years ago by Richard Pearson60
Answer: writeVcf bug
0
gravatar for Michael Lawrence
6.3 years ago by
United States
Michael Lawrence10k wrote:
I am pretty sure this bug has been fixed in devel. Would you mind pasting your devel session info? Thanks, Michael On Thu, Nov 29, 2012 at 8:44 AM, Richard Pearson <rpearson@well.ox.ac.uk>wrote: > Hi > > I've noticed a bug with writeVcf when trying to write out relatively > simple vcf files, e.g. a file containing only GT in the FORMAT fields. The > following is a (hopefully) reproducible example based on examples in > ?writeVcf > > library(VariantAnnotation) > fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") > out1.vcf <- tempfile() > out2.vcf <- tempfile() > in1 <- readVcf(fl, "hg19") > writeVcf(in1, out1.vcf) > in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) > writeVcf(in2, out2.vcf) > > This last line gives me: > > Error in row(genoMat) : > a matrix-like object is required as argument to 'row' > > Looking at the following code snippet from .makeVcfGeno, it seems genoMat > is first getting created as a matrix, but then changed to something that is > not a matrix by "genoMat <- genoMatFlat". The error then comes from the > "row(genoMat)" on the last line shown. I don't really understand what is > going on in this code so can't offer a patch I'm afraid. > > genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, > recursive = FALSE), nsub * nrec, length(geno)) > genoMatFlat <- as.character(unlist(genoMat)) > genoMatFlat[is.na(genoMatFlat)**] <- "." > if (is.list(genoMat)) { > genoMatList <- relist(genoMatFlat, PartitioningByEnd(genoMat)) > genoMatFlat <- .pasteCollapse(genoMatList, ",") > genoMat <- matrix(genoMatFlat, nrow(genoMat), ncol(genoMat)) > } > else genoMat <- genoMatFlat > formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * > nrec, length(geno), byrow = TRUE) > keep <- !is.na(formatMatPerSub) > genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) > > FWIW, as a temporary fix I made an extra, slightly more complex FORMAT > field using the following: > > geno(in2)[["DUMMY"]] <- matrix(sapply(as.vector(geno(**in2)[["GT"]]), > function(x) list(c(x, x))), ncol=ncol(geno(in2)[["GT"]]), > dimnames=dimnames(geno(in2)[["**GT"]])) > writeVcf(in2, out2.vcf) > > I've checked and it seems this bug is also present in latest devel. Any > chance of a fix? > > Thanks > > Richard > > > sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 > LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 > LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 > LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 > GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 > devtools_0.8 BiocInstaller_1.8.3 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 > bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 > evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 > [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 > RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 stats4_2.15.1 > stringr_0.6.1 tools_2.15.1 whisker_0.1 > [21] XML_3.95-0.1 zlibbioc_1.4.0 > > ______________________________**_________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.et="" hz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: http://news.gmane.org/gmane.** > science.biology.informatics.**conductor<http: news.gmane.org="" gmane.="" science.biology.informatics.conductor=""> > [[alternative HTML version deleted]]
ADD COMMENTlink written 6.3 years ago by Michael Lawrence10k
Hi Michael My devel sessionInfo is below. It seems biocLite has given me version 1.5.17of the package, but I see that .makeVcfGenolooks different in the latest from SVN (1.5.19) - so I'm sure you're right and I just need to wait for things to filter through. Apologies for the noise. Every time I upgrade VariantAnnotation it is an even better package than before - many thanks for all the hard work! Richard > sessionInfo() R Under development (unstable) (2012-11-27 r61172) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] VariantAnnotation_1.5.17 Rsamtools_1.11.11 Biostrings_2.27.7 GenomicRanges_1.11.15 IRanges_1.17.15 BiocGenerics_0.5.1 devtools_0.8 BiocInstaller_1.9.4 loaded via a namespace (and not attached): [1] AnnotationDbi_1.21.7 Biobase_2.19.1 biomaRt_2.15.0 bitops_1.0-5 BSgenome_1.27.1 DBI_0.2-5 digest_0.6.0 evaluate_0.4.2 GenomicFeatures_1.11.5 httr_0.2 [11] memoise_0.1 plyr_1.7.1 RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.19.6 stats4_2.16.0 stringr_0.6.1 tools_2.16.0 whisker_0.1 XML_3.95-0.1 [21] zlibbioc_1.5.0 On 29/11/2012 17:22, Michael Lawrence wrote: > I am pretty sure this bug has been fixed in devel. Would you mind > pasting your devel session info? > > Thanks, > Michael > > > On Thu, Nov 29, 2012 at 8:44 AM, Richard Pearson > <rpearson@well.ox.ac.uk <mailto:rpearson@well.ox.ac.uk="">> wrote: > > Hi > > I've noticed a bug with writeVcf when trying to write out > relatively simple vcf files, e.g. a file containing only GT in the > FORMAT fields. The following is a (hopefully) reproducible example > based on examples in ?writeVcf > > library(VariantAnnotation) > fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") > out1.vcf <- tempfile() > out2.vcf <- tempfile() > in1 <- readVcf(fl, "hg19") > writeVcf(in1, out1.vcf) > in2 <- readVcf(fl, "hg19", ScanVcfParam(geno="GT")) > writeVcf(in2, out2.vcf) > > This last line gives me: > > Error in row(genoMat) : > a matrix-like object is required as argument to 'row' > > Looking at the following code snippet from .makeVcfGeno, it seems > genoMat is first getting created as a matrix, but then changed to > something that is not a matrix by "genoMat <- genoMatFlat". The > error then comes from the "row(genoMat)" on the last line shown. I > don't really understand what is going on in this code so can't > offer a patch I'm afraid. > > genoMat <- matrix(unlist(as.list(geno), use.names = FALSE, > recursive = FALSE), nsub * nrec, length(geno)) > genoMatFlat <- as.character(unlist(genoMat)) > genoMatFlat[is.na <http: is.na="">(genoMatFlat)] <- "." > if (is.list(genoMat)) { > genoMatList <- relist(genoMatFlat, PartitioningByEnd(genoMat)) > genoMatFlat <- .pasteCollapse(genoMatList, ",") > genoMat <- matrix(genoMatFlat, nrow(genoMat), ncol(genoMat)) > } > else genoMat <- genoMatFlat > formatMatPerSub <- matrix(rep(t(formatMat), nsub), nsub * > nrec, length(geno), byrow = TRUE) > keep <- !is.na <http: is.na="">(formatMatPerSub) > genoListBySub <- seqsplit(genoMat[keep], row(genoMat)[keep]) > > FWIW, as a temporary fix I made an extra, slightly more complex > FORMAT field using the following: > > geno(in2)[["DUMMY"]] <- > matrix(sapply(as.vector(geno(in2)[["GT"]]), function(x) list(c(x, > x))), ncol=ncol(geno(in2)[["GT"]]), > dimnames=dimnames(geno(in2)[["GT"]])) > writeVcf(in2, out2.vcf) > > I've checked and it seems this bug is also present in latest > devel. Any chance of a fix? > > Thanks > > Richard > > > sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 > LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 > LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] VariantAnnotation_1.4.5 Rsamtools_1.10.2 Biostrings_2.26.2 > GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0 > devtools_0.8 BiocInstaller_1.8.3 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.20.3 Biobase_2.18.0 biomaRt_2.14.0 > bitops_1.0-5 BSgenome_1.26.1 DBI_0.2-5 digest_0.6.0 > evaluate_0.4.2 GenomicFeatures_1.10.1 httr_0.2 > [11] memoise_0.1 parallel_2.15.1 plyr_1.7.1 > RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.1 > stats4_2.15.1 stringr_0.6.1 tools_2.15.1 whisker_0.1 > [21] XML_3.95-0.1 zlibbioc_1.4.0 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org <mailto:bioconductor@r-project.org> > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > [[alternative HTML version deleted]]
ADD REPLYlink written 6.3 years ago by Richard Pearson60
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 380 users visited in the last hour