Question: A problem with csaw that I don't think is a problem with csaw
0
gravatar for sorjuelal
8 months ago by
sorjuelal0
sorjuelal0 wrote:

Hello,

I updated to R 3.5.1 on an ubuntu 16.04.5 LTS machine. I am using bioconductor 3.8 packages. Once I run windowCounts() from csaw I get a *** caught segfault *** address 0x7, cause 'memory not mapped'.

Here is example code using a bam file from the csaw package:

> library(csaw)
> pe.bam <- system.file("exdata", "pet.bam", package = "csaw")
> pe.param <- readParam(max.frag=400, pe="both")
> demo <- windowCounts(pe.bam, ext=250, param=pe.param)

 *** caught segfault ***
address 0x7, cause 'memory not mapped'

Traceback:
 1: .getPairedEnd(bam.file, where = where, param = param)
 2: (function (bam.file, where, param, init.ext, final.ext, outlen,     bin, shift, width, spacing, total.pts, at.start) {    if (param$pe != "both") {        reads <- .getSingleEnd(bam.file, where = where, param = param)        extended <- .extendSE(reads, ext = init.ext, final = final.ext,             chrlen = outlen)        frag.start <- extended$start        frag.end <- extended$end        rlengths <- cbind(c(mean(reads$forward$qwidth), mean(reads$reverse$qwidth)),             c(length(reads$forward$qwidth), length(reads$reverse$qwidth)))    }    else {        out <- .getPairedEnd(bam.file, where = where, param = param)        if (bin) {            mid <- as.integer(out$pos + out$size/2)            mid <- pmin(mid, outlen)            frag.end <- frag.start <- mid        }        else {            checked <- .coerceFragments(out$pos, out$pos + out$size -                 1L, final = final.ext, chrlen = outlen)            frag.start <- checked$start            frag.end <- checked$end        }        rlengths <- c(mean(out$size), length(out$size))    }    right <- width - shift - 1L    left <- shift    frag.start <- frag.start - right    frag.end <- frag.end + left    out <- .Call(cxx_get_rle_counts, frag.start, frag.end, total.pts,         spacing, at.start)    return(list(counts = out, totals = length(frag.start), lengths = rlengths))})(bam.file = dots[[1L]][[1L]], init.ext = dots[[2L]][[1L]],     where = new("GRanges", seqnames = new("Rle", values = 1L,         lengths = 1L, elementMetadata = NULL, metadata = list()),         ranges = new("IRanges", start = 1L, width = 200L, NAMES = NULL,             elementType = "ANY", elementMetadata = NULL, metadata = list()),         strand = new("Rle", values = 3L, lengths = 1L, elementMetadata = NULL,             metadata = list()), seqinfo = new("Seqinfo", seqnames = "chrA",             seqlengths = NA_integer_, is_circular = NA, genome = NA_character_),         elementMetadata = new("DataFrame", rownames = NULL, nrows = 1L,             listData = list(), elementType = "ANY", elementMetadata = NULL,             metadata = list()), elementType = "ANY", metadata = list()),     param = new("readParam", pe = "both", max.frag = 400L, dedup = FALSE,         minq = NA_integer_, forward = NA, restrict = character(0),         discard = new("GRanges", seqnames = new("Rle", values = integer(0),             lengths = integer(0), elementMetadata = NULL, metadata = list()),             ranges = new("IRanges", start = integer(0), width = integer(0),                 NAMES = NULL, elementType = "ANY", elementMetadata = NULL,                 metadata = list()), strand = new("Rle", values = integer(0),                 lengths = integer(0), elementMetadata = NULL,                 metadata = list()), seqinfo = new("Seqinfo",                 seqnames = character(0), seqlengths = integer(0),                 is_circular = logical(0), genome = character(0)),             elementMetadata = new("DataFrame", rownames = NULL,                 nrows = 0L, listData = list(), elementType = "ANY",                 elementMetadata = NULL, metadata = list()), elementType = "ANY",             metadata = list()), BPPARAM = new("SerialParam",             .xData = <environment>)), final.ext = NA_integer_,     outlen = c(chrA = 200L), bin = FALSE, shift = 0L, width = 50L,     spacing = 50L, total.pts = 4L, at.start = TRUE)
 3: .mapply(.FUN, dots, .MoreArgs)
 4: FUN(...)
 5: doTryCatch(return(expr), name, parentenv, handler)
 6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 8: tryCatch({    FUN(...)}, error = handle_error)
 9: withCallingHandlers({    tryCatch({        FUN(...)    }, error = handle_error)}, warning = handle_warning)
10: FUN(...)
11: FUN(X[[i]], ...)
12: lapply(X, FUN_, ...)
13: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,     .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
14: bplapply(X = seq_along(ddd[[1L]]), wrap, .FUN = FUN, .ddd = ddd,     .MoreArgs = MoreArgs, BPREDO = BPREDO, BPPARAM = BPPARAM)
15: bpmapply(FUN = .window_counts, bam.file = bam.files, init.ext = ext.data$ext,     MoreArgs = list(where = where, param = param, final.ext = ext.data$final,         outlen = outlen, bin = bin, shift = shift, width = width,         spacing = spacing, total.pts = total.pts, at.start = at.start),     BPPARAM = param$BPPARAM, SIMPLIFY = FALSE)
16: bpmapply(FUN = .window_counts, bam.file = bam.files, init.ext = ext.data$ext,     MoreArgs = list(where = where, param = param, final.ext = ext.data$final,         outlen = outlen, bin = bin, shift = shift, width = width,         spacing = spacing, total.pts = total.pts, at.start = at.start),     BPPARAM = param$BPPARAM, SIMPLIFY = FALSE)
17: windowCounts(pe.bam, ext = 250, param = pe.param)
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)

Here is my session info:

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/local/R/R-3.5.1/lib/libRblas.so
LAPACK: /usr/local/R/R-3.5.1/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_CA.UTF-8    LC_NUMERIC=C            LC_TIME=C              
 [4] LC_COLLATE=en_CA.UTF-8  LC_MONETARY=C           LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C              LC_NAME=C               LC_ADDRESS=C           
[10] LC_TELEPHONE=C          LC_MEASUREMENT=C        LC_IDENTIFICATION=C    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] csaw_1.16.0                 SummarizedExperiment_1.12.0
 [3] DelayedArray_0.8.0          BiocParallel_1.16.1        
 [5] matrixStats_0.54.0          Biobase_2.42.0             
 [7] GenomicRanges_1.34.0        GenomeInfoDb_1.18.1        
 [9] IRanges_2.16.0              S4Vectors_0.20.1           
[11] BiocGenerics_0.28.0        

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0               compiler_3.5.1           XVector_0.22.0          
 [4] prettyunits_1.0.2        GenomicFeatures_1.34.1   bitops_1.0-6            
 [7] tools_3.5.1              zlibbioc_1.28.0          progress_1.2.0          
[10] biomaRt_2.38.0           digest_0.6.18            bit_1.1-14              
[13] RSQLite_2.1.1            memoise_1.1.0            lattice_0.20-38         
[16] pkgconfig_2.0.2          rlang_0.3.0.1            Matrix_1.2-15           
[19] DBI_1.0.0                GenomeInfoDbData_1.2.0   rtracklayer_1.42.1      
[22] httr_1.3.1               stringr_1.3.1            hms_0.4.2               
[25] Biostrings_2.50.1        locfit_1.5-9.1           bit64_0.9-7             
[28] grid_3.5.1               R6_2.3.0                 AnnotationDbi_1.44.0    
[31] XML_3.98-1.16            limma_3.38.2             edgeR_3.24.0            
[34] magrittr_1.5             blob_1.1.1               GenomicAlignments_1.18.0
[37] Rsamtools_1.34.0         assertthat_0.2.0         stringi_1.2.4           
[40] RCurl_1.95-4.11          crayon_1.3.4

Any input is appreciated,

Stephany  

csaw caught segfault • 218 views
ADD COMMENTlink modified 8 months ago • written 8 months ago by sorjuelal0
Answer: A problem with csaw that I don't think is a problem with csaw
0
gravatar for Aaron Lun
8 months ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

That does not look good. It seems like a problem with some of the C++ code for handling paired-end reads. The question is whether this is due to code that I wrote or whether it is due to the HTS library. Can you:

  1. Copy the above example code into a separate R script, e.g., blah.R, and
  2. From the command line, run R CMD BATCH -d valgrind --no-save blah.R, and
  3. Show the output of the resulting blah.Rout file?

You should already have valgrind installed if you're on an Ubuntu system. 

ADD COMMENTlink written 8 months ago by Aaron Lun24k
Answer: A problem with csaw that I don't think is a problem with csaw
0
gravatar for sorjuelal
8 months ago by
sorjuelal0
sorjuelal0 wrote:

The first half:

> pe.bam <- system.file("exdata", "pet.bam", package = "csaw")
> pe.param <- readParam(max.frag=400, pe="both")
> demo <- windowCounts(pe.bam, ext=250, param=pe.param)
==46718== Invalid read of size 4
==46718==    at 0x42A52870: bam_cigar2rlen (sam.c:331)
==46718==    by 0x427FBB7E: BamRead::extract_data(AlignData&) const (bam_utils.cpp:54)
==46718==    by 0x42817D31: extract_pair_data (pair_reads.cpp:129)
==46718==    by 0x4F2D709: R_doDotCall (dotcode.c:596)
==46718==    by 0x4F67A20: bcEval (eval.c:7289)
==46718==    by 0x4F7137F: Rf_eval (eval.c:624)
==46718==    by 0x4F72CE0: R_execClosure (eval.c:1773)
==46718==    by 0x4F68A16: bcEval (eval.c:6749)
==46718==    by 0x4F7137F: Rf_eval (eval.c:624)
==46718==    by 0x4F72CE0: R_execClosure (eval.c:1773)
==46718==    by 0x4F75C85: R_forceAndCall (eval.c:1840)
==46718==    by 0x4FA2E26: do_mapply (mapply.c:105)
==46718==  Address 0x7 is not stack'd, malloc'd or (recently) free'd
==46718== 

 *** caught segfault ***
address 0x7, cause 'memory not mapped'

Traceback:
 1: .getPairedEnd(bam.file, where = where, param = param)
 ...
ADD COMMENTlink modified 8 months ago • written 8 months ago by sorjuelal0

That's all I needed to know. Looks like it's a failure with parsing the CIGAR string... though I don't know how this could happen. Please contact me offline at the csaw maintainer address if you want to be involved in further debugging - I will create a modified version of the package for you to run to see whether the same error is triggered.

ADD REPLYlink written 8 months ago by Aaron Lun24k

FYI if you install a debugger version of the package:

BiocManager::install("LTLA/csaw@debugger")

... the example will print the read name and cigar length up to the point that it fails. This'll tell us which read is the problem.

ADD REPLYlink modified 8 months ago • written 8 months ago by Aaron Lun24k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 275 users visited in the last hour