Question: A problem with csaw that I don't think is a problem with csaw
0
gravatar for sorjuelal
12 months ago by
sorjuelal0
sorjuelal0 wrote:

Hello,

I updated to R 3.5.1 on an ubuntu 16.04.5 LTS machine. I am using bioconductor 3.8 packages. Once I run windowCounts() from csaw I get a *** caught segfault *** address 0x7, cause 'memory not mapped'.

Here is example code using a bam file from the csaw package:

> library(csaw)
> pe.bam <- system.file("exdata", "pet.bam", package = "csaw")
> pe.param <- readParam(max.frag=400, pe="both")
> demo <- windowCounts(pe.bam, ext=250, param=pe.param)

 *** caught segfault ***
address 0x7, cause 'memory not mapped'

Traceback:
 1: .getPairedEnd(bam.file, where = where, param = param)
 2: (function (bam.file, where, param, init.ext, final.ext, outlen,     bin, shift, width, spacing, total.pts, at.start) {    if (param$pe != "both") {        reads <- .getSingleEnd(bam.file, where = where, param = param)        extended <- .extendSE(reads, ext = init.ext, final = final.ext,             chrlen = outlen)        frag.start <- extended$start        frag.end <- extended$end        rlengths <- cbind(c(mean(reads$forward$qwidth), mean(reads$reverse$qwidth)),             c(length(reads$forward$qwidth), length(reads$reverse$qwidth)))    }    else {        out <- .getPairedEnd(bam.file, where = where, param = param)        if (bin) {            mid <- as.integer(out$pos + out$size/2)            mid <- pmin(mid, outlen)            frag.end <- frag.start <- mid        }        else {            checked <- .coerceFragments(out$pos, out$pos + out$size -                 1L, final = final.ext, chrlen = outlen)            frag.start <- checked$start            frag.end <- checked$end        }        rlengths <- c(mean(out$size), length(out$size))    }    right <- width - shift - 1L    left <- shift    frag.start <- frag.start - right    frag.end <- frag.end + left    out <- .Call(cxx_get_rle_counts, frag.start, frag.end, total.pts,         spacing, at.start)    return(list(counts = out, totals = length(frag.start), lengths = rlengths))})(bam.file = dots[[1L]][[1L]], init.ext = dots[[2L]][[1L]],     where = new("GRanges", seqnames = new("Rle", values = 1L,         lengths = 1L, elementMetadata = NULL, metadata = list()),         ranges = new("IRanges", start = 1L, width = 200L, NAMES = NULL,             elementType = "ANY", elementMetadata = NULL, metadata = list()),         strand = new("Rle", values = 3L, lengths = 1L, elementMetadata = NULL,             metadata = list()), seqinfo = new("Seqinfo", seqnames = "chrA",             seqlengths = NA_integer_, is_circular = NA, genome = NA_character_),         elementMetadata = new("DataFrame", rownames = NULL, nrows = 1L,             listData = list(), elementType = "ANY", elementMetadata = NULL,             metadata = list()), elementType = "ANY", metadata = list()),     param = new("readParam", pe = "both", max.frag = 400L, dedup = FALSE,         minq = NA_integer_, forward = NA, restrict = character(0),         discard = new("GRanges", seqnames = new("Rle", values = integer(0),             lengths = integer(0), elementMetadata = NULL, metadata = list()),             ranges = new("IRanges", start = integer(0), width = integer(0),                 NAMES = NULL, elementType = "ANY", elementMetadata = NULL,                 metadata = list()), strand = new("Rle", values = integer(0),                 lengths = integer(0), elementMetadata = NULL,                 metadata = list()), seqinfo = new("Seqinfo",                 seqnames = character(0), seqlengths = integer(0),                 is_circular = logical(0), genome = character(0)),             elementMetadata = new("DataFrame", rownames = NULL,                 nrows = 0L, listData = list(), elementType = "ANY",                 elementMetadata = NULL, metadata = list()), elementType = "ANY",             metadata = list()), BPPARAM = new("SerialParam",             .xData = <environment>)), final.ext = NA_integer_,     outlen = c(chrA = 200L), bin = FALSE, shift = 0L, width = 50L,     spacing = 50L, total.pts = 4L, at.start = TRUE)
 3: .mapply(.FUN, dots, .MoreArgs)
 4: FUN(...)
 5: doTryCatch(return(expr), name, parentenv, handler)
 6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 8: tryCatch({    FUN(...)}, error = handle_error)
 9: withCallingHandlers({    tryCatch({        FUN(...)    }, error = handle_error)}, warning = handle_warning)
10: FUN(...)
11: FUN(X[[i]], ...)
12: lapply(X, FUN_, ...)
13: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,     .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
14: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,     .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
15: bpmapply(FUN = .window_counts, bam.file = bam.files, init.ext = ext.data$ext,     MoreArgs = list(where = where, param = param, final.ext = ext.data$final,         outlen = outlen, bin = bin, shift = shift, width = width,         spacing = spacing, total.pts = total.pts, at.start = at.start),     BPPARAM = param$BPPARAM, SIMPLIFY = FALSE)
16: bpmapply(FUN = .window_counts, bam.file = bam.files, init.ext = ext.data$ext,     MoreArgs = list(where = where, param = param, final.ext = ext.data$final,         outlen = outlen, bin = bin, shift = shift, width = width,         spacing = spacing, total.pts = total.pts, at.start = at.start),     BPPARAM = param$BPPARAM, SIMPLIFY = FALSE)
17: windowCounts(pe.bam, ext = 250, param = pe.param)
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)

Here is my session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/local/R/R-3.5.1/lib/libRblas.so
LAPACK: /usr/local/R/R-3.5.1/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_CA.UTF-8    LC_NUMERIC=C            LC_TIME=C              
 [4] LC_COLLATE=en_CA.UTF-8  LC_MONETARY=C           LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C              LC_NAME=C               LC_ADDRESS=C           
[10] LC_TELEPHONE=C          LC_MEASUREMENT=C        LC_IDENTIFICATION=C    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] csaw_1.16.0                 SummarizedExperiment_1.12.0
 [3] DelayedArray_0.8.0          BiocParallel_1.16.1        
 [5] matrixStats_0.54.0          Biobase_2.42.0             
 [7] GenomicRanges_1.34.0        GenomeInfoDb_1.18.1        
 [9] IRanges_2.16.0              S4Vectors_0.20.1           
[11] BiocGenerics_0.28.0        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0               compiler_3.5.1           XVector_0.22.0          
 [4] prettyunits_1.0.2        GenomicFeatures_1.34.1   bitops_1.0-6            
 [7] tools_3.5.1              zlibbioc_1.28.0          progress_1.2.0          
[10] biomaRt_2.38.0           digest_0.6.18            bit_1.1-14              
[13] RSQLite_2.1.1            memoise_1.1.0            lattice_0.20-38         
[16] pkgconfig_2.0.2          rlang_0.3.0.1            Matrix_1.2-15           
[19] DBI_1.0.0                GenomeInfoDbData_1.2.0   rtracklayer_1.42.1      
[22] httr_1.3.1               stringr_1.3.1            hms_0.4.2               
[25] Biostrings_2.50.1        locfit_1.5-9.1           bit64_0.9-7             
[28] grid_3.5.1               R6_2.3.0                 AnnotationDbi_1.44.0    
[31] XML_3.98-1.16            limma_3.38.2             edgeR_3.24.0            
[34] magrittr_1.5             blob_1.1.1               GenomicAlignments_1.18.0
[37] Rsamtools_1.34.0         assertthat_0.2.0         stringi_1.2.4           
[40] RCurl_1.95-4.11          crayon_1.3.4

Any input is appreciated,

Stephany  

csaw caught segfault • 250 views
ADD COMMENTlink modified 12 months ago • written 12 months ago by sorjuelal0
Answer: A problem with csaw that I don't think is a problem with csaw
0
gravatar for Aaron Lun
12 months ago by
Aaron Lun25k
Cambridge, United Kingdom
Aaron Lun25k wrote:

That does not look good. It seems like a problem with some of the C++ code for handling paired-end reads. The question is whether this is due to code that I wrote or whether it is due to the HTS library. Can you:

  1. Copy the above example code into a separate R script, e.g., blah.R, and
  2. From the command line, run R CMD BATCH -d valgrind --no-save blah.R, and
  3. Show the output of the resulting blah.Rout file?

You should already have valgrind installed if you're on an Ubuntu system. 

ADD COMMENTlink written 12 months ago by Aaron Lun25k
Answer: A problem with csaw that I don't think is a problem with csaw
0
gravatar for sorjuelal
12 months ago by
sorjuelal0
sorjuelal0 wrote:

The first half:

> pe.bam <- system.file("exdata", "pet.bam", package = "csaw")
> pe.param <- readParam(max.frag=400, pe="both")
> demo <- windowCounts(pe.bam, ext=250, param=pe.param)
==46718== Invalid read of size 4
==46718==    at 0x42A52870: bam_cigar2rlen (sam.c:331)
==46718==    by 0x427FBB7E: BamRead::extract_data(AlignData&) const (bam_utils.cpp:54)
==46718==    by 0x42817D31: extract_pair_data (pair_reads.cpp:129)
==46718==    by 0x4F2D709: R_doDotCall (dotcode.c:596)
==46718==    by 0x4F67A20: bcEval (eval.c:7289)
==46718==    by 0x4F7137F: Rf_eval (eval.c:624)
==46718==    by 0x4F72CE0: R_execClosure (eval.c:1773)
==46718==    by 0x4F68A16: bcEval (eval.c:6749)
==46718==    by 0x4F7137F: Rf_eval (eval.c:624)
==46718==    by 0x4F72CE0: R_execClosure (eval.c:1773)
==46718==    by 0x4F75C85: R_forceAndCall (eval.c:1840)
==46718==    by 0x4FA2E26: do_mapply (mapply.c:105)
==46718==  Address 0x7 is not stack'd, malloc'd or (recently) free'd
==46718== 

 *** caught segfault ***
address 0x7, cause 'memory not mapped'

Traceback:
 1: .getPairedEnd(bam.file, where = where, param = param)
 ...
ADD COMMENTlink modified 12 months ago • written 12 months ago by sorjuelal0

That's all I needed to know. Looks like it's a failure with parsing the CIGAR string... though I don't know how this could happen. Please contact me offline at the csaw maintainer address if you want to be involved in further debugging - I will create a modified version of the package for you to run to see whether the same error is triggered.

ADD REPLYlink written 12 months ago by Aaron Lun25k

FYI if you install a debugger version of the package:

BiocManager::install("LTLA/csaw@debugger")

... the example will print the read name and cigar length up to the point that it fails. This'll tell us which read is the problem.

ADD REPLYlink modified 12 months ago • written 12 months ago by Aaron Lun25k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 211 users visited in the last hour