Hi,
I hope someone can help with this. As you will see from my failure to provide the information needed to actually diagnose this (i.e., error logs), I am an R noob. Please point me to the correct logs and I will happily provide them!
The edgeR function processAmplicons is causing R (RStudio version 0.98.978) to crash. I have it working with a normal number of reads (20 million) and a small hairpin table (200 hairpins is no problem), but it crashes if I feed it my whole hairpin list of ~120,000. It generally (but not always) reports:
-- Number of Barcodes : 4
-- Number of Hairpins : 119461
Then it sits indefinitely and eventually reports the following (see below) and restarts. This usually coincides with me clicking over to another tab in Chrome, but I have also let it run uninterrupted for 8+ hours with no progress. With 200 hairpins, processing takes about a minute and progress is reported after each 10 million reads. With 120,000 hairpins it will not process no matter how small I make the data set, and it never reports progress; it just hangs during or after reading in the hairpins.
21 May 2015 20:29:15 [rsession-*****] ERROR session hadabend; LOGGED FROM: core::Error<unnamed>::rInit(const r::session::RInitInfo&) /root/rstudio/src/cpp/session/SessionMain.cpp:1694
Session info added 21:37 EDT
R version 3.1.1 (2014-07-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
character(0)

other attached packages:
[1] edgeR_3.8.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.3      base_3.1.1       colorspace_1.2-4 datasets_3.1.1   grDevices_3.1.1
 [6] graphics_3.1.1   limma_3.22.7     methods_3.1.1    stats_3.1.1      tools_3.1.1
[11] utils_3.1.1
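For anyone else wondering how to produce a report like the one above, here is a minimal sketch using only base R. The edgeR check is guarded so that the snippet still runs on a machine where edgeR is not installed; the variable name `si` is just for illustration.

```r
# Capture and print the R version, platform, locale, and the versions of
# attached and loaded packages.
si <- sessionInfo()
print(si)

# Report the installed edgeR version, but only if edgeR is available.
if (requireNamespace("edgeR", quietly = TRUE)) {
  print(packageVersion("edgeR"))
}
```

Pasting the printed output into the post is usually enough for others to see which R and package versions are in play.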
How large are the files? Is it possible to post a minimal working example (e.g., on DropBox) that reproduces the problem?
Also, you haven't posted your sessionInfo, but make sure that you're using the latest version of edgeR. I don't think this'll solve your problem as the hairpin processing code has been stable for several versions; but, we should make sure, just in case.
sessionInfo is above.
I made a very small example set to demonstrate the failure. Please see the .R file in the link below.
https://www.dropbox.com/sh/dbwuukjixar16y0/AADj3vFtU0NIfLaf6NXEC-Lfa?dl=0
In the meantime, the person in charge of our RStudio install has suggested I run it on the command line on our cluster. I erroneously believed that our RStudio was set up to go through the interactive queue on our scheduler, but apparently it's just running on a couple of nodes or something.
Thanks again.
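Since 200 hairpins work and ~120,000 do not, one way to narrow things down might be to write out hairpin tables of increasing size and try each in turn. The sketch below is only an illustration: `write_hairpin_subset` is a hypothetical helper, the table is synthetic stand-in data, and it assumes the hairpin file is tab-delimited with `ID` and `Sequences` columns.

```r
# Hypothetical helper: write the first n rows of a hairpin table to a
# tab-delimited file and return the file path.
write_hairpin_subset <- function(hairpins, n, path) {
  n <- min(n, nrow(hairpins))
  write.table(hairpins[seq_len(n), , drop = FALSE], file = path,
              sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(path)
}

# Synthetic stand-in for the real hairpin table (random 20-mers).
set.seed(1)
make_seq <- function(len) {
  paste(sample(c("A", "C", "G", "T"), len, replace = TRUE), collapse = "")
}
hairpins <- data.frame(
  ID = paste0("sg", seq_len(1000)),
  Sequences = vapply(seq_len(1000), function(i) make_seq(20), character(1)),
  stringsAsFactors = FALSE
)

# Write subsets of increasing size to a temporary directory.
sizes <- c(200, 500, 1000)
subset_files <- vapply(sizes, function(n) {
  write_hairpin_subset(hairpins, n,
                       file.path(tempdir(), sprintf("hairpins_%d.txt", n)))
}, character(1))

# Each file could then be passed as the hairpin file to processAmplicons,
# one at a time, to see at which size the failure first appears.
print(subset_files)
```

With the real table, the sizes would be chosen between 200 and the full list, halving the interval after each run.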
Running R/3.1.0 on command line I get the following error with the big (120,000) hairpin list:
> source("sgrnalib.R")
-- Number of Barcodes : 4
-- Number of Hairpins : 119461
*** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
1: .C(.cprocessHairpinReads, IsPairedReads, readfile, readfile2, as.integer(length(readfile)), as.character(tempbarcodefile), as.character(temphairpinfile), as.integer(barcodeStart), as.integer(barcodeEnd), as.integer(barcodeStartRev), as.integer(barcodeEndRev), as.integer(hairpinStart), as.integer(hairpinEnd), as.integer(allowShifting), as.integer(shiftingBase), as.integer(allowMismatch), as.integer(barcodeMismatchBase), as.integer(hairpinMismatchBase), as.integer(allowShiftedMismatch), as.character(tempoutfile), as.integer(verbose))
2: doTryCatch(return(expr), name, parentenv, handler)
3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
5: tryCatch({
    if (!IsPairedReads) {
        readfile2 = "DummyReadfile.fastq"
        barcodeStartRev = 0
        barcodeEndRev = 0
    }
    .C(.cprocessHairpinReads, IsPairedReads, readfile, readfile2, as.integer(length(readfile)), as.character(tempbarcodefile), as.character(temphairpinfile), as.integer(barcodeStart), as.integer(barcodeEnd), as.integer(barcodeStartRev), as.integer(barcodeEndRev), as.integer(hairpinStart), as.integer(hairpinEnd), as.integer(allowShifting), as.integer(shiftingBase), as.integer(allowMismatch), as.integer(barcodeMismatchBase), as.integer(hairpinMismatchBase), as.integer(allowShiftedMismatch), as.character(tempoutfile), as.integer(verbose))
    hairpinReadsSummary <- read.table(tempoutfile, sep = "\t", header = FALSE)
}, error = function(err) {
    print(paste("ERROR MESSAGE: ", err))
}, finally = {
    if (file.exists(tempbarcodefile)) file.remove(tempbarcodefile)
    if (file.exists(temphairpinfile)) file.remove(temphairpinfile)
    if (file.exists(tempoutfile)) file.remove(tempoutfile)
})
6: processAmplicons("plasmidpart.fastq", readfile2 = NULL, "Samples1.txt", "sgrnauniqueAB.txt", barcodeStart = 1, barcodeEnd = 9, hairpinStart = 34, hairpinEnd = 53, allowShifting = TRUE, shiftingBase = 2, allowMismatch = TRUE, barcodeMismatchBase = 2, hairpinMismatchBase = 2, allowShiftedMismatch = TRUE, verbose = TRUE)
7: eval(expr, envir, enclos)
8: eval(ei, envir)
9: withVisible(eval(ei, envir))
10: source("sgrnalib.R")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
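The segfault happens inside the C routine right after the hairpin count is printed, so it may be worth sanity-checking the hairpin table in R before the call. This is not a diagnosis of the crash, just a sketch of a pre-flight check. `check_hairpin_table` is a hypothetical helper, the toy table is made up, and the `ID` and `Sequences` column names are an assumption about the file layout. The expected length is taken from hairpinStart = 34 and hairpinEnd = 53 in the call above, i.e. 20 bases.

```r
# Hypothetical pre-flight check: report rows whose sequence has the wrong
# length, duplicated sequences or IDs, and non-ACGT characters.
check_hairpin_table <- function(hairpins, hairpinStart, hairpinEnd) {
  seqs <- as.character(hairpins$Sequences)
  expected_len <- hairpinEnd - hairpinStart + 1
  list(
    n              = length(seqs),
    wrong_length   = which(nchar(seqs) != expected_len),
    duplicated_seq = which(duplicated(seqs)),
    duplicated_id  = which(duplicated(hairpins$ID)),
    non_acgt       = which(!grepl("^[ACGT]+$", seqs))
  )
}

# Made-up table with one problem of each kind, for illustration only.
toy <- data.frame(
  ID = c("sg1", "sg2", "sg3", "sg4"),
  Sequences = c("ACGTACGTACGTACGTACGT",  # 20 bases, fine
                "ACGTACGTACGTACGTACGT",  # duplicate of sg1
                "ACGTACGTACGTACGTAC",    # only 18 bases
                "ACGTACGTNCGTACGTACGT"), # contains an N
  stringsAsFactors = FALSE
)

res <- check_hairpin_table(toy, hairpinStart = 34, hairpinEnd = 53)
str(res)
```

On the real table, the same function would be run on the data frame returned by `read.table("sgrnauniqueAB.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)`; any non-empty entries point at rows worth inspecting by hand.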