Hi,
I have previously used the DiffBind package to analyze some ChIP-Seq data successfully. After upgrading R (version 3.2.1) and installing the latest package version, the dba.count() function in DiffBind (version 1.14.4) now fails with the following error on a Linux machine:
> diffh3k4me3 = dba.count(diffh3k4me3,bParallel=FALSE)
Sample: ../data/K562H3k4me3chr1.bed125

 *** caught segfault ***
address 0xffffffff0edb1b9e, cause 'memory not mapped'

Traceback:
 1: .Call("croi_count_reads", bamfile, as.integer(insertLength), as.integer(fileType), as.integer(bufferSize), as.integer(minMappingQual), as.character(intervals[[1]]), as.integer(intervals[[2]]), as.integer(intervals[[3]]), as.integer(icount), as.logical(bWithoutDupes), as.logical(bSummits), counts, summits.vec, heights.vec)
 2: cpp_count_reads(bamfile, insertLength, fileType, bufferSize, intervals, bWithoutDupes, summits, minMappingQuality)
 3: pv.getCounts(bamfile = countrec$bamfile, intervals = intervals, insertLength = countrec$insert, bWithoutDupes = bWithoutDupes, bLowMem = bLowMem, yieldSize = yieldSize, mode = mode, singleEnd = singleEnd, scanbamparam = scanbamparam, fileType = fileType, summits = summits, fragments = fragments, minMappingQuality = minMappingQuality)
 4: pv.do_getCounts(job, bed, bWithoutDupes = bWithoutDupes, bLowMem, yieldSize, mode, singleEnd, scanbamparam, readFormat, summits, fragments, minMappingQuality)
 5: pv.listadd(results, pv.do_getCounts(job, bed, bWithoutDupes = bWithoutDupes, bLowMem, yieldSize, mode, singleEnd, scanbamparam, readFormat, summits, fragments, minMappingQuality))
 6: pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score, bLog = bLog, insertLength = fragmentSize, bOnlyCounts = T, bCalledMasks = TRUE, minMaxval = filter, bParallel = bParallel, bUseLast = bUseLast, bWithoutDupes = bRemoveDuplicates, bScaleControl = bScaleControl, filterFun = filterFun, bLowMem = bUseSummarizeOverlaps, readFormat = readFormat, summits = summits, minMappingQuality = mapQCth)
 7: dba.count(diffh3k4me3, bParallel = FALSE)
I've read through some similar help topics, such as this one: "About the DiffBind dba.count() crash problems".
But the advice given there is to use the most up-to-date package version, which I believe I am already doing.
Note that the first few lines of my .bed files look like so:
chr1 10077 10113 SL-XBE_3_FC30CLLAAXX:6:37:844:899 0 -
chr1 10103 10139 SL-XBE_3_FC30CLLAAXX:6:46:95:1579 0 +
chr1 10236 10272 SL-XBE_3_FC30CLLAAXX:6:31:1257:999 48 +
chr1 10240 10276 SL-XBE_3_FC30CLLAAXX:6:43:601:1863 19 -
Also, I have tried using the same package version, data, and R version on an OSX machine and receive a different error when calling dba.count():
Error in if (sum(tokeep) < length(tokeep)) { : missing value where TRUE/FALSE needed
This error seems to be similar to the one encountered in this help topic: http://seqanswers.com/forums/showthread.php?t=46345
But in that topic the error was on a Windows machine and it was resolved by using a Mac!
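For reference, the OSX message is the one base R raises whenever an if() condition evaluates to NA. That would happen in the quoted line if tokeep contained an NA, although whether that is what actually occurs inside DiffBind here is an assumption. A minimal sketch with a made-up tokeep vector (not DiffBind's real data):

```r
# Hypothetical stand-in for the 'tokeep' vector named in the error message.
tokeep <- c(TRUE, NA, FALSE)

# sum() propagates NA, so the comparison is NA rather than TRUE or FALSE.
cond <- sum(tokeep) < length(tokeep)
print(is.na(cond))

# Using an NA condition in if() raises the error; capture its message.
msg <- tryCatch({
  if (cond) "some entries dropped" else "all entries kept"
}, error = function(e) conditionMessage(e))
print(msg)
```

If this is the mechanism, the NA would most likely come from a value that could not be parsed or matched earlier in the pipeline, which is consistent with a problem in the input file rather than in the platform.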
I'm not sure what to try next and I appreciate any help, thanks!
Full session info for the Linux machine run:
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Arch Linux

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] BayesPeak_1.18.2        DiffBind_1.14.4         RSQLite_1.0.0
 [4] DBI_0.3.1               locfit_1.5-9.1          GenomicAlignments_1.2.2
 [7] Rsamtools_1.20.4        Biostrings_2.36.1       XVector_0.8.0
[10] limma_3.22.7            GenomicRanges_1.20.5    GenomeInfoDb_1.4.1
[13] IRanges_2.2.5           S4Vectors_0.6.1         BiocGenerics_0.14.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.4            lattice_0.20-31        GO.db_3.1.2
 [4] gtools_3.4.1           digest_0.6.8           foreach_1.4.2
 [7] plyr_1.8.1             BatchJobs_1.6          ShortRead_1.26.0
[10] ggplot2_1.0.0          gplots_2.16.0          zlibbioc_1.12.0
[13] annotate_1.46.0        gdata_2.13.3           Matrix_1.2-0
[16] checkmate_1.5.2        systemPipeR_1.2.8      proto_0.3-10
[19] GOstats_2.34.0         splines_3.2.0          BiocParallel_1.0.3
[22] stringr_0.6.2          pheatmap_1.0.2         munsell_0.4.2
[25] sendmailR_1.2-1        base64enc_0.1-2        BBmisc_1.9
[28] fail_1.2               edgeR_3.8.6            codetools_0.2-11
[31] XML_3.98-1.1           AnnotationForge_1.10.1 MASS_7.3-40
[34] bitops_1.0-6           grid_3.2.0             RBGL_1.44.0
[37] xtable_1.7-4           GSEABase_1.30.2        gtable_0.1.2
[40] scales_0.2.4           graph_1.44.1           KernSmooth_2.23-14
[43] amap_0.8-14            hwriter_1.3.2          reshape2_1.4.1
[46] genefilter_1.48.1      latticeExtra_0.6-26    brew_1.0-6
[49] rjson_0.2.15           RColorBrewer_1.1-2     iterators_1.0.7
[52] tools_3.2.0            Biobase_2.26.0         Category_2.34.2
[55] survival_2.38-1        AnnotationDbi_1.30.1   colorspace_1.2-6
[58] caTools_1.17.1
Hi,
The read-counting code hasn't changed for several releases. Can you show the *last* few lines of your bed file? My guess is that the file is corrupted in some way (truncated last line, for example). Alternatively, can you make the file available on the web somewhere, so I can download it and examine it?
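One way to look for that kind of damage from within R is sketched below. check_bed_lines is a hypothetical helper, not a DiffBind function, and it assumes six whitespace-separated columns (chromosome, start, end, name, score, strand) as in the excerpt posted above. The demo file is made up for illustration, with a deliberately truncated last line.

```r
# Hypothetical helper (not part of DiffBind): report lines of a BED file of
# reads that do not look like complete 6-column records.
check_bed_lines <- function(path, n_fields = 6L) {
  lines  <- readLines(path, warn = FALSE)   # warn = FALSE: tolerate a missing final newline
  fields <- strsplit(lines, "[ \t]+")
  n      <- vapply(fields, length, integer(1))
  ok     <- n == n_fields                    # wrong column count => suspect line
  is_int <- function(x) grepl("^[0-9]+$", x)
  for (i in which(ok)) {
    f <- fields[[i]]
    # start and end must be non-negative integers with start < end,
    # and the strand column must be one of "+", "-" or "."
    ok[i] <- is_int(f[2]) && is_int(f[3]) &&
      as.numeric(f[2]) < as.numeric(f[3]) &&
      f[6] %in% c("+", "-", ".")
  }
  data.frame(line = which(!ok), text = lines[!ok], stringsAsFactors = FALSE)
}

# Made-up demo file: two complete records followed by a truncated one.
demo <- tempfile(fileext = ".bed")
writeLines(c(
  "chr1\t10077\t10113\tSL-XBE_3_FC30CLLAAXX:6:37:844:899\t0\t-",
  "chr1\t10103\t10139\tSL-XBE_3_FC30CLLAAXX:6:46:95:1579\t0\t+",
  "chr1\t10236\t102"
), demo)

bad <- check_bed_lines(demo)
print(bad)                                   # suspect lines with their line numbers
print(tail(readLines(demo, warn = FALSE), 3)) # the last few lines, for posting
```

Note that this reads the whole file into memory, which should be fine for a single-chromosome file but may be slow for a genome-wide one. An empty result only means that no line failed these basic checks, not that the file is guaranteed to be valid.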
By the way, we'll probably remove support for bed files (for the reads) and only support BAM in the next release or two, so you might want to look into moving to BAM format.
- Gord