I am following the GAGE pathway analysis workflow, starting from the 'gene_exp.diff' file produced by Cuffdiff v1.3.0 (part 7.5 here), and get the following error when using pathview:
> pathview(gene.data = exp.fc, pathway.id = "04110", species = "hsa", same.layer=F, kegg.native = T, out.suffix=out.suffix)
Info: Downloading xml files for hsa04115, 1/1 pathways..
Info: Downloading png files for hsa04115, 1/1 pathways..

 *** caught segfault ***
address 0x7f8400000000, cause 'memory not mapped'

Traceback:
 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), as.logical(isHTML), as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, as.integer(options), as.logical(parentFirst), PACKAGE = "XML")
 2: xmlTreeParse(file, getDTD = FALSE)
 3: parseKGML2(object)
 4: node.info(xml.file[i])
 5: doTryCatch(return(expr), name, parentenv, handler)
 6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7: tryCatchList(expr, classes, parentenv, handlers)
 8: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L msg <- conditionMessage(e) sm <- strsplit(msg, "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && identical(getOption("show.error.messages"), TRUE)) { cat(msg, file = stderr()) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
 9: try(node.info(xml.file[i]), silent = T)
10: pathview(gene.data = exp.fc, pathway.id = "04115", species = "hsa", same.layer = F, kegg.native = T, out.suffix = out.suffix)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
My session info:
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] gage_2.20.0
 [2] pathview_1.10.1
 [3] org.Hs.eg.db_3.2.3
 [4] RSQLite_1.0.0
 [5] DBI_0.3.1
 [6] GenomicAlignments_1.6.1
 [7] Rsamtools_1.22.0
 [8] Biostrings_2.38.2
 [9] XVector_0.10.0
[10] SummarizedExperiment_1.0.1
[11] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[12] GenomicFeatures_1.22.6
[13] AnnotationDbi_1.32.2
[14] Biobase_2.30.0
[15] GenomicRanges_1.22.1
[16] GenomeInfoDb_1.6.1
[17] IRanges_2.4.5
[18] S4Vectors_0.8.4
[19] BiocGenerics_0.16.1

loaded via a namespace (and not attached):
 [1] graph_1.48.0         KEGGgraph_1.28.0     magrittr_1.5
 [4] zlibbioc_1.16.0      BiocParallel_1.4.0   R6_2.1.1
 [7] stringr_1.0.0        httr_1.0.0           tools_3.2.2
[10] grid_3.2.2           png_0.1-7            lambda.r_1.1.7
[13] futile.logger_1.4.1  Rgraphviz_2.14.0     rtracklayer_1.30.1
[16] futile.options_1.0.0 bitops_1.0-6         KEGGREST_1.10.0
[19] RCurl_1.95-4.7       biomaRt_2.26.1       stringi_1.0-1
[22] XML_3.98-1.3
I also noticed this output along the way (not errors, but probably not a good sign either):
> gnames.eg=pathview::id2eg(gnames, category ="symbol")
'select()' returned 1:many mapping between keys and columns
[1] "Note: 1075 of 25103 unique input IDs unmapped."
> sel2=gnames.eg[,2]>""
> cuff.fc=cuff.fc[sel2]
> names(cuff.fc)=gnames.eg[sel2,2]
> range(cuff.fc)
[1] NA NA
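One possible reading of the `[1] NA NA` result, assuming the unmapped symbols come back from id2eg as NA (an assumption, not something confirmed in this thread): comparing NA with "" gives NA rather than FALSE, and a logical index that contains NA produces NA elements, so range() returns NA NA. A minimal base-R sketch with made-up values (`eg.ids`, `sel.raw`, `sel.ok` and `fc.ok` are hypothetical names):

```r
# Toy stand-ins for the objects in the post (all values are made up).
cuff.fc <- c(1.2, -0.8, 2.5, 0.3)
eg.ids  <- c("1017", NA, "", "7157")    # NA = unmapped symbol (assumption)

sel.raw <- eg.ids > ""                  # NA > "" evaluates to NA, not FALSE
print(range(cuff.fc[sel.raw]))          # an NA index yields an NA element

sel.ok <- !is.na(eg.ids) & eg.ids > ""  # treat unmapped IDs as FALSE instead
fc.ok  <- cuff.fc[sel.ok]
names(fc.ok) <- eg.ids[sel.ok]
print(range(fc.ok))                     # finite range over the mapped genes
```

If the real data behave the same way, building the selection with `!is.na(...)` before subsetting should leave only mapped, named fold changes to pass on to pathview.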
I tried many different pathway IDs and always got this error. The .xml and .png files are downloaded to my working directory, but without any of my values mapped onto them (even when specifying an expression difference of < or >0.1).
It looks like the segfault is from the XML package, though it might be an interaction with another program; you might try adding
trace(XML::xmlTreeParse, quote(print(file)))
before the call to pathview, and print all the output. It will be hard to identify the problem without a fully reproducible example.

I tried the trace command you mentioned, but nothing was output except the error message I described in my first post.
Note:
* Before the error occurred, I also updated the XML package.
* I am running the code on a cluster (no other programs are running).
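Since the traceback ends inside xmlTreeParse, one way to narrow things down might be to call that function on a tiny file outside pathview. The sketch below is only an illustration: the file content is a made-up toy, not real KGML, and `root.name` is a hypothetical variable. If this call also crashes on the cluster, the problem would more likely lie in the XML installation than in pathview or the input data.

```r
# Parse a tiny, made-up XML file with the same call that appears in
# frame 2 of the traceback: xmlTreeParse(file, getDTD = FALSE).
root.name <- NA_character_
if (requireNamespace("XML", quietly = TRUE)) {
  tmp <- tempfile(fileext = ".xml")
  writeLines(c('<?xml version="1.0"?>',
               '<pathway name="path:toy" number="00000">',
               '  <entry id="1" name="toy:1" type="gene"/>',
               '</pathway>'), tmp)
  doc <- XML::xmlTreeParse(tmp, getDTD = FALSE)
  root.name <- XML::xmlName(XML::xmlRoot(doc))  # name of the root element
}
print(root.name)
```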
You must then make your example reproducible, either by providing access to exp.fc and out.suffix (or better a simplified subset of this data), or by providing an example that fails while using publicly available data.
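A small subset can be shared as plain text or as a file. The following is a rough sketch, assuming exp.fc is a named numeric vector of fold changes; the simulated values and the name `small.fc` are placeholders, not the poster's data.

```r
# Hypothetical stand-in for the poster's named fold-change vector.
set.seed(1)
exp.fc <- setNames(rnorm(1000), as.character(seq_len(1000)))

# Keep a small subset of finite values so the example stays easy to post.
small.fc <- head(exp.fc[is.finite(exp.fc)], 20)

dput(small.fc)                           # text form that can be pasted into a post
rds.path <- file.path(tempdir(), "small_fc.rds")
saveRDS(small.fc, rds.path)              # or share it as a file
```

Anyone reading the thread could then recreate `small.fc` from the dput() output and run the same pathview call against it.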
I tried the exact same code with the exact same data on my laptop (Mac) and it works just fine.
I also tried other datasets and always get the same result: they work locally but not on the cluster.
I checked that the exact same packages and R version are installed on both machines (I get the exact same sessionInfo() on both laptop and cluster), but the cluster still returns the error message mentioned in my previous post.
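Identical sessionInfo() output does not rule out a difference in system libraries: it lists R package versions, not the libxml2 library that the XML package was compiled against. A hedged sketch for comparing that on both machines follows; `xml_build_info` is a hypothetical helper written for this thread, while `XML::libxmlVersion()` is a function provided by the XML package.

```r
# Collect details that sessionInfo() does not report, so they can be
# compared between the laptop and the cluster.
xml_build_info <- function() {
  if (!requireNamespace("XML", quietly = TRUE)) {
    return(list(installed = FALSE))
  }
  v <- XML::libxmlVersion()  # list with major, minor and patch components
  list(
    installed   = TRUE,
    xml_package = as.character(utils::packageVersion("XML")),
    libxml2     = paste(v$major, v$minor, v$patch, sep = "."),
    library_dir = dirname(find.package("XML"))  # where XML was installed
  )
}

info <- xml_build_info()
str(info)
```

Running this on both machines and comparing the `libxml2` and `library_dir` fields might show whether the cluster build differs from the laptop build.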
Do you have any idea what could be wrong? (knowing that I use a fresh installation on the cluster where nothing is shared).