Question: Problems running oligo functions in parallel on a Linux computer
Asked 18 months ago by s.munster (USA/Oklahoma City):

I am running a Linux computer (Scientific Linux 7.4) with 8 cores. I would like to run some of the functions from oligo, such as rma, in parallel, but so far I have had no success.

I have tried the suggestions in the oligo manual: using ff, and using foreach with doMC. Neither sped up the computations, and neither appeared to run anything in parallel at all.

I have tried the parallel package with mclapply, using read.celfiles as my test case. It does run in parallel, but it does not give me one expression set containing all of the CEL files. It returns a separate expression set for each CEL file, which is not useful.
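The behaviour described here is how mclapply is designed to work: like lapply, it returns a list with one element per input, so merging has to happen in a separate step afterwards. The sketch below illustrates that pattern with toy matrices only. `read_one` and the file names are made up for illustration; whether an efficient combine step exists for oligo feature sets is not something this sketch establishes.

```r
library(parallel)

# Hypothetical stand-in for reading one file: returns a one-column matrix
# whose column is named after the file. No real CEL file is read.
read_one <- function(f) {
  matrix(nchar(f) + seq_len(3), ncol = 1, dimnames = list(NULL, f))
}

files <- c("a.CEL", "bb.CEL", "ccc.CEL", "dddd.CEL")  # made-up names

# mclapply returns a list with one element per input, just like lapply
pieces <- mclapply(files, read_one, mc.cores = 2)

# A separate serial step merges the per-file pieces into a single object
combined <- do.call(cbind, pieces)
dim(combined)
```

By contrast, passing the whole vector of file names to read.celfiles in a single call is what yields one object for all samples, so a parallel read only helps if the per-file results can be merged cheaply afterwards.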

I have also tried BiocParallel with the same read.celfiles test, also without luck. bpiterate fails with "no method for coercing this S4 class to a vector". bpvec fails because the length of the output does not match the length of the input vector. bpmapply does parallelize and read in the CEL files but, like parallel, it creates a separate expression set for each one.
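The bpvec failure seems to follow from the contract of chunked "vectorized" helpers: the input is split into chunks, the function is applied to each chunk, and the results are concatenated, so each chunk is expected to return as many elements as it received. A function that turns many file names into one object breaks that expectation. The sketch below uses base R's parallel::pvec as a stand-in, on the assumption that bpvec follows the same idea. Note that pvec only warns in this situation, whereas the bpvec call in this post stops with an error.

```r
library(parallel)

x <- 1:10

# Element-wise function: every chunk returns as many values as it received,
# so the concatenated result lines up with the input.
ok <- pvec(x, sqrt, mc.cores = 2)
length(ok) == length(x)

# Collapsing function: every chunk returns a single value, so the result is
# shorter than the input. pvec warns about this; the warning is silenced here.
folded <- suppressWarnings(pvec(x, sum, mc.cores = 2))
length(folded) < length(x)
```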

I have checked with taskset that R is allowed to use any or all of the eight cores, so R is not being limited by a system setting. Does anyone know of a way to run functions such as read.celfiles and oligo::rma in parallel? Most of the examples I have seen read items from a list and produce a separate output for each line, each file, and so on. Is this the only way R can run in parallel?

 

BiocParallel Code:

R version 3.4.2 (2017-09-28) -- "Short Summer"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)


> library(BiocParallel)
> param <- SnowParam(workers = 6, type = "SOCK")
> library(affy)
> library(oligo)
> a <- list.celfiles()
> FUN <- function(x) {
+   library(oligo)
+   library(affy)
+   rawData <- read.celfiles(x)
+ }
> bpvec(x, FUN, AGGREGATE = c, BPREDO = list(), BPPARAM = param)

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : NF25-AL_2_(HTA-2_0)_3.CEL
Reading in : NF25-SA_2_(HTA-2_0).CEL
Reading in : NF27-AL_(HTA-2_0).CEL
Reading in : NF27-SA_(HTA-2_0).CEL
Reading in : NF32-AL_(HTA-2_0).CEL
Reading in : NF32-SA_(HTA-2_0).CEL
Reading in : NG01-AL_(HTA-2_0).CEL
Reading in : NG01-SA_(HTA-2_0).CEL
Reading in : NG07-AL_(HTA-2_0).CEL
Reading in : NG07-SA_(HTA-2_0).CEL
Reading in : NG09-AL_(HTA-2_0).CEL
Reading in : NG09-SA_(HTA-2_0).CEL
Reading in : NG16-AL_(HTA-2_0).CEL
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : NF01-AL_(HTA-2_0).CEL
Reading in : NF01-SA_(HTA-2_0).CEL
Reading in : NF07-AL_(HTA-2_0).CEL
Reading in : NF07-SA_(HTA-2_0).CEL
Reading in : NF09-AL_(HTA-2_0).CEL
Reading in : NF09-SA_(HTA-2_0).CEL
Reading in : NF16-AL_(HTA-2_0).CEL
Reading in : NF16-SA_(HTA-2_0)_3.CEL
Reading in : NF19-AL_(HTA-2_0).CEL
Reading in : NF19-SA_(HTA-2_0).CEL
Reading in : NF21-AL_(HTA-2_0).CEL
Reading in : NF21-SA_(HTA-2_0).CEL
Reading in : NF24-AL_(HTA-2_0).CEL
Reading in : NF24-SA_(HTA-2_0).CEL

.....(this repeats, in chunks, to read in all 80 files)

Error: length(FUN(X)) not equal to length(X)
> bpiterate(x, FUN, BPPARAM=param, REDUCE=merge)
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error: 'bpiterate' receive data failed:
  error reading from connection
> Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted
Error in as.list.default(X) : 
  no method for coercing this S4 class to a vector
Calls: local ... doTryCatch -> bpok -> vapply -> as.list -> as.list.default
Execution halted

> bpmapply(FUN, a, BPPARAM=param)

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Platform design info loaded.

Reading in : SF04-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF04-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF10-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF10-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF15-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF15-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF18-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF18-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF20-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF20-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF23-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF23-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : SF26-AL_(HTA-2_0).CEL
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

Loading required package: pd.hta.2.0

Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : NF01-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF01-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF07-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF07-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF09-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF09-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF16-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF16-SA_(HTA-2_0)_3.CEL
Platform design info loaded.
Reading in : NF19-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF19-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF21-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF21-SA_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF24-AL_(HTA-2_0).CEL
Platform design info loaded.
Reading in : NF24-SA_(HTA-2_0).CEL

..... (it did this for all 80 files)

$`NF01-AL_(HTA-2_0).CEL`
HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples 
  element names: exprs 
protocolData
  rowNames: NF01-AL_(HTA-2_0).CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: NF01-AL_(HTA-2_0).CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hta.2.0 

$`NF01-SA_(HTA-2_0).CEL`
HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples 
  element names: exprs 
protocolData
  rowNames: NF01-SA_(HTA-2_0).CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: NF01-SA_(HTA-2_0).CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hta.2.0 

..... (there were 80 of these....)

> bpmapply(FUN, a, BPPARAM=param, SIMPLIFLY=TRUE, USE.NAME=TRUE)
Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: unused arguments (SIMPLIFLY = dots[[2]][[1]], USE.NAME = dots[[3]][[1]])
> bpmapply(FUN, a, BPPARAM=param, SIMPLIFLY=TRUE)
Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: unused argument (SIMPLIFLY = dots[[2]][[1]])
> ?bpaggregate
> ?bpaggregate
> bpaggregate(a, FUN, BPPARAM=param)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘bpaggregate’ for signature ‘"character", "SnowParam"’
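
The two "unused argument" errors in the session above look like the result of misspelled option names rather than a BiocParallel limitation: bpmapply follows the argument names of base R's mapply, which are SIMPLIFY and USE.NAMES. An unrecognised name such as SIMPLIFLY is treated as one more vector to iterate over and is passed on to FUN, which then rejects it. The sketch below reproduces this with base mapply only. `f` is a made-up function, and the sketch does not show that a correctly spelled SIMPLIFY would merge S4 objects into one.

```r
f <- function(x) toupper(x)

# Misspelled option: mapply() does not recognise SIMPLIFLY, so it is handed
# to f() as an extra argument, and f() rejects it.
msg <- tryCatch(
  mapply(f, c("a", "b"), SIMPLIFLY = TRUE),
  error = function(e) conditionMessage(e)
)
msg

# Correct spelling: the options are consumed by mapply() itself.
res <- mapply(f, c("a", "b"), SIMPLIFY = FALSE, USE.NAMES = TRUE)
str(res)
```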

 

 

 

 

 
