Question: BiocParallel and CopywriteR Error

genomic8328 wrote (4 months ago):

I recently tried to run CopywriteR on a Microsoft Azure cloud Windows Server Datacenter virtual machine (128 GB RAM and 16 cores) with R 3.3.2. My input data files are a 12.67 GB normal BAM and an 11 GB tumor BAM.

I received the following error:
Error: 'bplapply' receive data failed: error reading from connection

Can you suggest a workaround? Maybe too many BAM records are being read at once?

Here is my code:

library("CopywriteR")
library("CopyhelpeR")
setwd("C:/Users/m/Desktop/share/data")
data.folder <- tools::file_path_as_absolute(file.path(getwd()))
preCopywriteR(output.folder=file.path(data.folder), bin.size=20000, ref.genome="hg38", prefix="chr")

list.dirs(path=file.path(data.folder), full.names=FALSE)
list.files(path=file.path(data.folder, "hg38_20kb_chr"), full.names=FALSE)
load(file=file.path(data.folder, "hg38_20kb_chr", "blacklist.rda"))
blacklist.grange

load(file=file.path(data.folder, "hg38_20kb_chr", "GC_mappability.rda"))
GC.mappa.grange[1001:1011]
bp.param <- SnowParam(workers = 15, type ="SOCK")
bp.param

path <- c("C:/Users/m/Desktop/share/data")
samples <- list.files(path=path, pattern="tumor.bam$", full.names=TRUE)
controls <- list.files(path=path, pattern="normal.bam$", full.names=TRUE)
sample.control <- data.frame(samples,controls)

CopywriteR(sample.control = sample.control, destination.folder = file.path(data.folder), reference.folder = file.path(data.folder, "hg38_20kb_chr"), bp.param = bp.param)

t.kuilman (Netherlands) wrote (4 months ago):

I am not sure whether this is an issue with CopywriteR; I think it might be an issue with BiocParallel (the package in which the bplapply function is defined) and/or a memory issue. I hope someone else can help with this.

Martin Morgan (United States) wrote (4 months ago):

My guess is that the amount of data being returned by the workers is too large to be represented in a serialized vector; I think the limit is probably 2^31 - 1 elements. Calling traceback() right after the error might help show where things are going wrong, and using SerialParam() would be a workaround (though it obviously forgoes parallel evaluation).
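
One way to get a feel for the size guess is to measure how many bytes an object occupies once it is serialized and compare that with 2^31 - 1. The following is only a minimal base-R sketch, not code from CopywriteR or BiocParallel: the helper name serialized_bytes and the toy result object are hypothetical, and the real worker output is internal to CopywriteR.

```r
# Hypothetical helper: number of bytes an object occupies after serialization,
# which is roughly what a worker has to send back over its connection.
serialized_bytes <- function(x) length(serialize(x, connection = NULL))

# Toy stand-in for a worker's return value (not real CopywriteR output).
result <- list(counts = integer(1e6), bins = seq_len(1e6))

n.bytes <- serialized_bytes(result)
limit <- .Machine$integer.max  # 2^31 - 1

# TRUE when the serialized result is below the limit.
n.bytes < limit
```

If a single result came anywhere near the limit, that would be consistent with the guess above, and passing bp.param <- SerialParam() to CopywriteR would be the first thing to try.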
