gmapR fills my tmp directory
@seantaylor-7638
Last seen 6.2 years ago
United States

I am attempting to create a GmapGenome index for hg19. Here is the code I am using:

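library(gmapR)                        # provides GmapGenomeDirectory() and GmapGenome()
library(BSgenome.Hsapiens.UCSC.hg19)  # provides the Hsapiens genome object
gmapGenomePath <- '/data/staylo/ref/ucsc/hg19/genome/gsnap/'
gmapGenomeDirectory <- GmapGenomeDirectory(gmapGenomePath, create=TRUE)
# Build the hg19 index with a k-mer size of 12
hg19 <- GmapGenome(genome=Hsapiens,
                   directory=gmapGenomeDirectory,
                   name='hg19',
                   create=TRUE, k=12L)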

The code exits with non-0 status: 

Writing 16777217 offsets to file with total of 965773859 k-mers...done
Running cat ./hg19.genomecomp | /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin//gmapindex -b 12 -k 12 -q 3  -d hg19 -F . -D . -P
Looking for index files in directory . (offsets not compressed)
  Offsets file is hg19.ref123offsets
  Positions file is hg19.ref123positions
cat: write error: Broken pipe
sh: line 1:  9040 Done(1)                 cat ./hg19.genomecomp
      9041 Segmentation fault      (core dumped) | /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin//gmapindex -b 12 -k 12 -q 3 -d hg19 -F . -D . -P
cat ./hg19.genomecomp | /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin//gmapindex -b 12 -k 12 -q 3  -d hg19 -F . -D . -P failed with return code 35584 at /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin/gmap_build line 259.
Error in .gmap_build(db = genome(genome), dir = path(directory(genome)),  : 
  system call returned a non-0 status: /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin/gmap_build --db=hg19 --dir=/data/staylo/ref/ucsc/hg19/genome/gsnap --kmer=12 --sort=none --circular=chrM -B /home/staylo/R/x86_64-redhat-linux-gnu-library/3.2/gmapR/usr/bin/ /tmp/RtmpsJnTvJ/gmap_build_fasta1efb63a8be94

 

I believe this happens because the job fills up my tmp directory, which is only 5GB on the system I am using. If I index something smaller, say chrX or even chr2, the code completes. I am working with our IT support to increase the size of tmp, but in the meantime, is there an argument I could pass to specify a different location for temporary files, for instance a larger data partition?

Thanks,

Sean

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.38.0                   rtracklayer_1.30.1                dplyr_0.4.3                      
 [5] Rbowtie_1.10.0                    ShortRead_1.28.0                  GenomicAlignments_1.6.1           BiocParallel_1.4.3               
 [9] sangerseqR_1.6.0                  seqTools_1.4.1                    zlibbioc_1.16.0                   sagenhaft_1.40.0                 
[13] SparseM_1.7                       gmapR_1.12.0                      VariantAnnotation_1.16.4          Rsamtools_1.22.0                 
[17] SummarizedExperiment_1.0.1        Biobase_2.30.0                    GenomicRanges_1.22.2              GenomeInfoDb_1.6.1               
[21] msa_1.2.1                         Biostrings_2.38.2                 XVector_0.10.0                    IRanges_2.4.6                    
[25] S4Vectors_0.8.5                   BiocGenerics_0.16.1              

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2            RColorBrewer_1.1-2     futile.logger_1.4.1    GenomicFeatures_1.22.7 bitops_1.0-6           futile.options_1.0.0  
 [7] tools_3.2.2            biomaRt_2.26.1         digest_0.6.8           lattice_0.20-33        RSQLite_1.0.0          shiny_0.12.2          
[13] DBI_0.3.1              hwriter_1.3.2          grid_3.2.2             R6_2.1.1               AnnotationDbi_1.32.3   XML_3.98-1.3          
[19] latticeExtra_0.6-26    magrittr_1.5           lambda.r_1.1.7         htmltools_0.2.6        assertthat_0.1         xtable_1.8-0          
[25] mime_0.4               httpuv_1.3.3           RCurl_1.95-4.7     
@michael-lawrence-3846
Last seen 3.0 years ago
United States

You should be able to set your TMPDIR environment variable.


I thought of that and checked with my IT support, but that environment variable is shared globally with the other users, so changing it would reroute more than I bargained for. If that is the only solution, I can work with our IT support to set up a bigger tmp location. I was hoping there might be a software solution where I could specify it as a one-off for just this command.


You can set that variable at the process level with Sys.setenv()


Tried that too and kept getting the old tmp directory. I guess I will have to wait for IT to build a bigger tmp.


Sorry, it looks like you need to set that when R is started. So just start R at the shell with:

TMPDIR="/my/tmp/dir" R
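
On the earlier worry about the variable being shared: in a POSIX shell, a `VAR=value command` prefix applies only to that one command and its children, so other users and the rest of your own shell session should be unaffected. A minimal sketch, using `sh -c` as a stand-in for R so it runs anywhere, and a hypothetical scratch directory created with `mktemp -d` in place of a real data partition:

```shell
# Hypothetical scratch directory standing in for a larger data partition.
scratch=$(mktemp -d)

# Record what the current shell sees before the one-off override.
before=${TMPDIR:-unset}

# The prefix assignment applies to this single child process only.
# In practice the child would be R, e.g.  TMPDIR="$scratch" R
child=$(TMPDIR="$scratch" sh -c 'printf "%s" "$TMPDIR"')

# The parent shell's value is unchanged afterwards.
after=${TMPDIR:-unset}

echo "child saw:     $child"
echo "parent before: $before"
echo "parent after:  $after"
```

The value has to be in place at launch because R creates its per-session temporary directory at startup, which would explain why changing the variable from inside a running session made no difference.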

Aha! I had tried that as well with no luck, but I think I may have been pointing it to a directory that was not writable. Once I pointed it to a valid location it started to work and I was able to complete the indexing. Thanks for the help!
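
For anyone who hits the same silent fallback: according to R's `tempfile` documentation, R checks TMPDIR, TMP and TEMP in turn and uses the first one that points to a writable directory, otherwise `/tmp`, so an unwritable TMPDIR is ignored without any warning. A small pre-flight check before starting R can catch that. The sketch below is only an illustration; `check_tmpdir` is a hypothetical helper name, and the scratch directory is created with `mktemp -d` so the example runs anywhere:

```shell
# check_tmpdir DIR: hypothetical helper. Succeeds only if DIR exists, is a
# directory, and is writable and searchable by the current user.
check_tmpdir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "not writable: $dir" >&2
        return 1
    fi
    # Report free space so a too-small partition is noticed early.
    df -Pk "$dir" | awk 'NR == 2 { print "free KB: " $4 }'
    return 0
}

# Example: validate first, then launch with the override. A stand-in command
# replaces R here; in practice the last line would be  TMPDIR="$scratch" R
scratch=$(mktemp -d)
if check_tmpdir "$scratch"; then
    TMPDIR="$scratch" sh -c 'echo "would start R with TMPDIR=$TMPDIR"'
fi
```

Once R is running, calling `tempdir()` shows which location was actually picked up.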
