gmapR problem using a GmapGenome package with gsnap()
1
0
Entering edit mode
Robert Castelo ★ 3.3k
@rcastelo
Last seen 5 days ago
Barcelona/Universitat Pompeu Fabra
hi, using the gmapR package i built the GmapGenome package from the Bioconductor hg19 human genome package as follows: library(gmapR) library(BSgenome.Hsapiens.UCSC.hg19) gmapGenomePath <- file.path("./HsapGenome") gmapGenomeDirectory <- GmapGenomeDirectory(gmapGenomePath, create=TRUE) gmapGenome <- GmapGenome(genome=Hsapiens, directory=gmapGenomeDirectory, name="hg19", create=TRUE) makeGmapGenomePackage(gmapGenome=gmapGenome, version="1.0.0", maintainer="<robert.castelo at="" upf.edu="">", author="Robert Castelo", destDir=".", license="Artistic-2.0", pkgName="GmapGenome.Hsapiens.UCSC.hg19") afterwards, i bundled the package from the shell with: $ R CMD build GmapGenome.Hsapiens.UCSC.hg19 and i install it with $ R CMD INSTALL GmapGenome.Hsapiens.UCSC.hg19_1.0.0.tar.gz when i try to use the package in a call to gsnap() it seems that there is no way to pass the *name* of the GmapGenome object to the command-line that works under the hoods, i.e.: library(GmapGenome.Hsapiens.UCSC.hg19) gsnapParam <- GsnapParam(genome=GmapGenome.Hsapiens.UCSC.hg19, unique_only=TRUE, nthreads=8) gsnapOutput <- gsnap(input_a=fastqs[1], input_b=NULL, params=gsnapParam) Error in .system_gsnap(commandLine("gsnap")) : Execution of gsnap failed, command-line: '/opt/R/R-3.0.0/lib64/R/aprilLibrary/gmapR/usr/bin/gsnap --db=GmapGenome.Hsapiens.UCSC.hg19 --dir=/opt/R/R-3.0.0/lib64/R/aprilLibrary/GmapGenome.Hsapiens.UCSC.hg1 9/extdata --nthreads=8 --npaths=1 --quiet-if-excessive --nofails --format=sam f.fastq > f.sam 2>&1'; last output line: '' the error is not very explanatory, but basically the command-line argument "--db=GmapGenome.Hsapiens.UCSC.hg19" should have been "--db=hg19". if i proceed that way in command-line, then it works. i've looked at the arguments of gsnap() and GsnapParam() but found no way to specify the genome object *name* that it would go to the "--db" parameter. i guess it should be a matter for the function GsnapParam() to fetch the prefix of the files in list.files(system.file("extdata", package="GmapGenome.Hsapiens.UCSC.hg19")) [1] "hg19.chromosome" "hg19.chromosome.iit" "hg19.chrsubset" "hg19.contig" [5] "hg19.contig.iit" "hg19.genomecomp" "hg19.ref12153gammaptrs" "hg19.ref12153offsetscomp" [9] "hg19.ref12153positions" "hg19.version" and plug it into the command-line call. by the way, not related but another but i incidentally found is then one tries to pass one *single* end read gzipped fastq file. library(LungCancerLines) fastqs <- LunCancerFastqFiles() gsnapOutput <- gsnap(input_a=fastqs[1], input_b=NULL, params=gsnapParam) Error in if (file_ext(input_a) == "gz" && file_ext(input_b) == "gz") { : missing value where TRUE/FALSE needed cheers, robert. ps: sessionInfo() R version 3.0.0 (2013-04-03) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF8 LC_NUMERIC=C LC_TIME=en_US.UTF8 LC_COLLATE=en_US.UTF8 [5] LC_MONETARY=en_US.UTF8 LC_MESSAGES=en_US.UTF8 LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] GmapGenome.Hsapiens.UCSC.hg19_1.0.0 LungCancerLines_0.0.8 BSgenome.Hsapiens.UCSC.hg19_1.3.19 [4] BSgenome_1.28.0 gmapR_1.2.0 ShortRead_1.18.0 [7] latticeExtra_0.6-24 RColorBrewer_1.0-5 Rsamtools_1.12.3 [10] lattice_0.20-15 Biostrings_2.28.0 GenomicRanges_1.12.3 [13] IRanges_1.18.1 BiocGenerics_0.6.0 vimcom_0.9-8 [16] setwidth_1.0-3 colorout_1.0-0 loaded via a namespace (and not attached): [1] AnnotationDbi_1.22.5 Biobase_2.20.0 biomaRt_2.16.0 bitops_1.0-5 [5] DBI_0.2-7 GenomicFeatures_1.12.1 grid_3.0.0 hwriter_1.3 [9] RCurl_1.95-4.1 RSQLite_0.11.3 rtracklayer_1.20.2 stats4_3.0.0 [13] tools_3.0.0 VariantAnnotation_1.6.5 XML_3.96-1.1 zlibbioc_1.6.0
GO BSgenome BSgenome gmapR GO BSgenome BSgenome gmapR • 1.3k views
ADD COMMENT
0
Entering edit mode
@michael-lawrence-3846
Last seen 2.4 years ago
United States
Hi Robert, Thanks for the useful report. I think I've fixed the issue in makeGmapGenomePackage. If you grab 1.3.2 (devel) and rebuild your package, it should work. Michael On Wed, May 15, 2013 at 8:59 AM, Robert Castelo <robert.castelo@upf.edu>wrote: > hi, > > using the gmapR package i built the GmapGenome package from the > Bioconductor hg19 human genome package as follows: > > library(gmapR) > library(BSgenome.Hsapiens.**UCSC.hg19) > gmapGenomePath <- file.path("./HsapGenome") > gmapGenomeDirectory <- GmapGenomeDirectory(**gmapGenomePath, create=TRUE) > gmapGenome <- GmapGenome(genome=Hsapiens, directory=gmapGenomeDirectory, > name="hg19", create=TRUE) > > makeGmapGenomePackage(**gmapGenome=gmapGenome, > version="1.0.0", > maintainer="<robert.castelo@**upf.edu<robert.castelo@upf.edu> > >", > author="Robert Castelo", > destDir=".", > license="Artistic-2.0", > pkgName="GmapGenome.Hsapiens.**UCSC.hg19") > > afterwards, i bundled the package from the shell with: > > $ R CMD build GmapGenome.Hsapiens.UCSC.hg19 > > and i install it with > > $ R CMD INSTALL GmapGenome.Hsapiens.UCSC.hg19_**1.0.0.tar.gz > > when i try to use the package in a call to gsnap() it seems that there is > no way to pass the *name* of the GmapGenome object to the command- line that > works under the hoods, i.e.: > > library(GmapGenome.Hsapiens.**UCSC.hg19) > > gsnapParam <- GsnapParam(genome=GmapGenome.**Hsapiens.UCSC.hg19, > unique_only=TRUE, nthreads=8) > > gsnapOutput <- gsnap(input_a=fastqs[1], input_b=NULL, params=gsnapParam) > Error in .system_gsnap(commandLine("**gsnap")) : > Execution of gsnap failed, command-line: '/opt/R/R-3.0.0/lib64/R/** > aprilLibrary/gmapR/usr/bin/**gsnap --db=GmapGenome.Hsapiens.UCSC.**hg19 > --dir=/opt/R/R-3.0.0/lib64/R/**aprilLibrary/GmapGenome.**Hsapiens.UC SC.hg19/extdata > --nthreads=8 --npaths=1 --quiet-if-excessive --nofails > --format=sam f.fastq > f.sam 2>&1'; last output line: '' > > the error is not very explanatory, but basically the command-line argument > "--db=GmapGenome.Hsapiens.**UCSC.hg19" should have been "--db=hg19". if i > proceed that way in command-line, then it works. > > i've looked at the arguments of gsnap() and GsnapParam() but found no way > to specify the genome object *name* that it would go to the "--db" > parameter. i guess it should be a matter for the function GsnapParam() to > fetch the prefix of the files in > > list.files(system.file("**extdata", package="GmapGenome.Hsapiens.** > UCSC.hg19")) > [1] "hg19.chromosome" "hg19.chromosome.iit" "hg19.chrsubset" > "hg19.contig" > [5] "hg19.contig.iit" "hg19.genomecomp" "hg19.ref12153gammaptrs" > "hg19.ref12153offsetscomp" > [9] "hg19.ref12153positions" "hg19.version" > > and plug it into the command-line call. > > by the way, not related but another but i incidentally found is then one > tries to pass one *single* end read gzipped fastq file. > > library(LungCancerLines) > fastqs <- LunCancerFastqFiles() > > gsnapOutput <- gsnap(input_a=fastqs[1], input_b=NULL, params=gsnapParam) > Error in if (file_ext(input_a) == "gz" && file_ext(input_b) == "gz") { : > missing value where TRUE/FALSE needed > > > cheers, > robert. > ps: sessionInfo() > R version 3.0.0 (2013-04-03) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF8 LC_NUMERIC=C LC_TIME=en_US.UTF8 > LC_COLLATE=en_US.UTF8 > [5] LC_MONETARY=en_US.UTF8 LC_MESSAGES=en_US.UTF8 LC_PAPER=C > LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF8 > LC_IDENTIFICATION=C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods > base > > other attached packages: > [1] GmapGenome.Hsapiens.UCSC.hg19_**1.0.0 LungCancerLines_0.0.8 > BSgenome.Hsapiens.UCSC.hg19_1.**3.19 > [4] BSgenome_1.28.0 gmapR_1.2.0 ShortRead_1.18.0 > [7] latticeExtra_0.6-24 RColorBrewer_1.0-5 > Rsamtools_1.12.3 > [10] lattice_0.20-15 Biostrings_2.28.0 > GenomicRanges_1.12.3 > [13] IRanges_1.18.1 BiocGenerics_0.6.0 > vimcom_0.9-8 > [16] setwidth_1.0-3 colorout_1.0-0 > > loaded via a namespace (and not attached): > [1] AnnotationDbi_1.22.5 Biobase_2.20.0 biomaRt_2.16.0 > bitops_1.0-5 > [5] DBI_0.2-7 GenomicFeatures_1.12.1 grid_3.0.0 > hwriter_1.3 > [9] RCurl_1.95-4.1 RSQLite_0.11.3 rtracklayer_1.20.2 > stats4_3.0.0 > [13] tools_3.0.0 VariantAnnotation_1.6.5 XML_3.96-1.1 > zlibbioc_1.6.0 > > ______________________________**_________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.et="" hz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: http://news.gmane.org/gmane.** > science.biology.informatics.**conductor<http: news.gmane.org="" gmane.="" science.biology.informatics.conductor=""> > [[alternative HTML version deleted]]
ADD COMMENT

Login before adding your answer.

Traffic: 771 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6