Question: snow library, question on clusterExport
7.6 years ago by
mattia pelizzola200 wrote:
Hi,

I have a simple function:

> library(snow)
> fun2=function() {
+ cl=makeCluster(3)
+ Mat=matrix(2:10,3,3)
+ fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
+ print(clusterApplyLB(cl, 1:3, fun3))
+ stopCluster(cl)
+ }

that is working fine:

> fun2()
[[1]]
     [,1] [,2] [,3]
[1,]    2    5    8
[2,]    3    6    9
[3,]    4    7   10

[[2]]
     [,1] [,2] [,3]
[1,]    3    6    9
[2,]    4    7   10

[[3]]
[1]  4  7 10

now, if I run the same commands outside the function:

> cl=makeCluster(3)
> Mat=matrix(2:10,3,3)
> fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
> print(clusterApplyLB(cl, 1:3, fun3))
Error in checkForRemoteErrors(val) :
  3 nodes produced errors; first error: object 'Mat' not found

so I figured out I have to export 'Mat' to the cluster nodes:

> clusterExport(cl, 'Mat')
> print(clusterApplyLB(cl, 1:3, fun3))
[[1]]
     [,1] [,2] [,3]
[1,]    2    5    8
[2,]    3    6    9
[3,]    4    7   10

[[2]]
     [,1] [,2] [,3]
[1,]    3    6    9
[2,]    4    7   10

[[3]]
[1]  4  7 10

I still do not understand why clusterExport is NOT necessary within the function 'fun2', where it actually gives an error:

> rm(Mat)
> fun2=function() {
+ cl=makeCluster(3)
+ Mat=matrix(2:10,3,3)
+ clusterExport(cl, 'Mat')
+ fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
+ print(clusterApplyLB(cl, 1:3, fun3))
+ stopCluster(cl)
+ }
> fun2()
Error in get(name, env = .GlobalEnv) : object 'Mat' not found

I found clusterExport to be the solution for a more complex example, but I can't make it work within a function.
What is happening here with clusterExport? And how can I export an object that is not in my globalEnv but rather is created within a function?

many thanks!

mattia

> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] snow_0.3-3
7.6 years ago by
Martin Morgan ♦♦ 20k
United States
Martin Morgan ♦♦ 20k wrote:
Hi Mattia -- Probably the newsgroup https://stat.ethz.ch/mailman/listinfo/r-sig-hpc is appropriate, but...

On 04/09/2010 03:47 PM, mattia pelizzola wrote:
> Hi,
>
> I have a simple function:
>
>> library(snow)
>> fun2=function() {
> + cl=makeCluster(3)
> + Mat=matrix(2:10,3,3)
> + fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
> + print(clusterApplyLB(cl, 1:3, fun3))
> + stopCluster(cl)
> + }

A function includes, as part of its definition, the environment it is defined in. So

  f1 <- function() {
      f2 <- function() {}
      x <- 1
      browser()
  }

  > f1()
  Called from: f1()
  Browse[1]> environment(f2)
  <environment: 0xb540b0>
  Browse[1]> ls(environment(f2))
  [1] "f2" "x"
  Browse[1]> environment(f2)[["x"]]
  [1] 1

In something like clusterApplyLB, snow sends 'fun3' to the worker. This includes 'fun3's environment, and that in turn includes the variable 'Mat'. Note that this could be a big surprise, e.g.,

  f1 = function() {
      f2 = function(i) i^2
      m = matrix(numeric(1e7), 1e3)
      clusterApplyLB(cl, 1:10, f2)
  }

sends the matrix 'm' to each node in the cluster (because it is defined in the environment of f2), even though it is irrelevant to the calculation performed by f2.

To illustrate

  f1 <- function(cl, x, do) {
      f2 <- function(i) ls(environment())
      y <- x
      if (do) clusterApply(cl, 1:2, f2)
  }

this sends a short vector

  > x <- integer(1); system.time(f1(cl, x, TRUE))
     user  system elapsed
    0.000   0.000   0.001

and a long vector, so takes more time

  > x <- integer(1e6); system.time(f1(cl, x, TRUE))
     user  system elapsed
    0.096   0.040   0.329

and here demonstrating that it's not the vector per se, but the transport

  > x <- integer(1e6); system.time(f1(cl, x, FALSE))
     user  system elapsed
        0       0       0

> that is working fine:
>
>> fun2()
> [[1]]
>      [,1] [,2] [,3]
> [1,]    2    5    8
> [2,]    3    6    9
> [3,]    4    7   10
>
> [[2]]
>      [,1] [,2] [,3]
> [1,]    3    6    9
> [2,]    4    7   10
>
> [[3]]
> [1]  4  7 10
>
> now, if I run the same commands outside the function:
>
>> cl=makeCluster(3)
>> Mat=matrix(2:10,3,3)
>> fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
>> print(clusterApplyLB(cl, 1:3, fun3))
> Error in checkForRemoteErrors(val) :
>   3 nodes produced errors; first error: object 'Mat' not found

Here snow has a special rule, which is 'do not export the global environment'. So environment(fun3) == .GlobalEnv, and 'Mat' is not exported, and not available to the worker.

> so I figured out I have to export 'Mat' on the cluster nodes:
>
>> clusterExport(cl, 'Mat')
>> print(clusterApplyLB(cl, 1:3, fun3))
> [[1]]
>      [,1] [,2] [,3]
> [1,]    2    5    8
> [2,]    3    6    9
> [3,]    4    7   10
>
> [[2]]
>      [,1] [,2] [,3]
> [1,]    3    6    9
> [2,]    4    7   10
>
> [[3]]
> [1]  4  7 10
>
> I still do not understand why clusterExport is NOT necessary within
> the function 'fun2' and actually it would give an error:
>
>> rm(Mat)
>> fun2=function() {
> + cl=makeCluster(3)
> + Mat=matrix(2:10,3,3)
> + clusterExport(cl, 'Mat')
> + fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
> + print(clusterApplyLB(cl, 1:3, fun3))
> + stopCluster(cl)
> + }
>> fun2()
> Error in get(name, env = .GlobalEnv) : object 'Mat' not found

from ?clusterExport,

  'clusterExport' assigns the global values on the master of the
  variables named in 'list' to variables of the same names in the
  global environments of each node.

so snow is just doing what it is documented to do: it looks for 'Mat' in the master's global environment, and inside fun2 'Mat' exists only in the function's own environment.

> I found clusterExport to be the solution for a more complex example,
> can I can't make it working within a function.
> What is it happening here with clusterExport? and how can I export an
> object that is not on my globalEnv but rather is created within a
> function?

Hope that provides enough information to work through your problem.

Martin

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
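The point about a function carrying its environment along can be checked without a cluster. The following is only a sketch, assuming that snow moves objects to the workers with R's serialize(); make_closure and payload are made-up names for illustration and are not from the thread.

```r
# Sketch, base R only: how many bytes does a closure serialize to?
# make_closure() and payload are hypothetical names used for illustration.
make_closure <- function(n) {
  payload <- numeric(n)   # lives in the environment the closure is defined in
  function(i) i^2         # never touches payload
}

small <- make_closure(1)
size_small <- length(serialize(small, NULL))

big <- make_closure(1e6)
size_big <- length(serialize(big, NULL))

# A function whose environment is the global environment (what a top-level
# definition in a script gets). serialize() writes the global environment as
# a reference, not by content, so global variables do not travel with it.
g <- function(i) i^2
environment(g) <- globalenv()
size_global <- length(serialize(g, NULL))

print(c(small = size_small, big = size_big, global = size_global))
```

The two closures do the same computation, yet the second one should serialize to several megabytes because payload rides along in its environment. The function attached to the global environment stays small, which matches the 'Mat' not found error seen when fun3 is defined at top level.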
7.6 years ago by
Pavelka, Norman70 wrote:
Hi Mattia,

Maybe I'm not getting what you're trying to do, but shouldn't your fun3 be using object 'data' rather than 'Mat' internally?

HTH ;-)
Norman

# > library(snow)
# > fun2=function() {
# + cl=makeCluster(3)
# + Mat=matrix(2:10,3,3)
# + fun3=function(startInd, endInd=3, data=Mat) {Mat[startInd:endInd,]}
# + print(clusterApplyLB(cl, 1:3, fun3))
# + stopCluster(cl)
# + }
thanks Martin for the explanations and thanks Norman for pointing out that error in the example,

unfortunately I am still stuck with the main problem:
I have to use clusterExport to export an object to the cluster nodes. clusterExport only seems to export objects from the GlobalEnv, unfortunately. In my case this object is created within a function and clusterExport is called within the same function, so the object is not available in the GlobalEnv and I get an error.

I'll try writing to the other mailing list,
thanks

mattia

On Sun, Apr 11, 2010 at 8:49 AM, Pavelka, Norman <nxp at stowers.org> wrote:
> Hi Mattia,
>
> Maybe I'm not getting what you're trying to do, but shouldn't your fun3 be using object 'data' rather than 'Mat' internally?
>
> HTH ;-)
> Norman
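For readers hitting the same wall: the thread uses snow 0.3-3, where clusterExport only looks in the global environment. The sketch below assumes a newer setup instead, namely the parallel package that ships with current R, whose clusterExport takes an envir argument. It is an illustration under that assumption, not something confirmed for the snow version discussed here.

```r
# Sketch, assuming the 'parallel' package (bundled with current R) in place
# of snow 0.3-3. Its clusterExport() accepts an 'envir' argument.
library(parallel)

fun2 <- function() {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  Mat <- matrix(2:10, 3, 3)

  # Look 'Mat' up in fun2's own frame instead of the master's .GlobalEnv;
  # it is assigned into the global environment of each worker.
  clusterExport(cl, "Mat", envir = environment())

  # Attach the worker function to the global environment so that fun2's
  # local variables are not shipped with it; on the worker, 'Mat' is then
  # found in the worker's global environment.
  fun3 <- function(startInd, endInd = 3) Mat[startInd:endInd, ]
  environment(fun3) <- globalenv()

  clusterApplyLB(cl, 1:3, fun3)
}

res <- fun2()
print(res)
```

Resetting the environment of fun3 is only there to show that the exported copy is the one being used. Without that line the call would also work, because 'Mat' would travel inside fun3's environment as Martin described.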
On 04/12/2010 09:25 AM, mattia pelizzola wrote:
> thanks Martin for the explanations and thanks Norman for pointing out
> that error in the example,
>
> unfortunately I am still stuck with the main problem:
> I have to use clusterExport to export an object to the cluster nodes.

It's hard to know without a (simple) example why you have to use clusterExport. If the object is not defined in the environment of the function, then a sure-fire way of getting it to your cluster nodes is to explicitly include it in the clusterApplyLB call

  fun4=function(startInd, endInd=3, data) data[startInd:endInd,]
  clusterApplyLB(cl, 1:3, fun4, data=Mat)

Martin

> clusterExport only seems to export objects from the GlobalEnv,
> unfortunately. In my case this object is created within a function and
> clusterExport is called within the same function, so the object is not
> available in the GlobalEnv and I get error ..
>
> I'll try writing to the other mailing list,
> thanks
>
> mattia

-- 
Martin Morgan
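Martin's suggestion of passing the object in the call can be tried end to end as follows. This is a minimal sketch that assumes the parallel package as a stand-in for snow; run_slices is a hypothetical wrapper name, and the two-worker cluster is an arbitrary choice.

```r
# Sketch, assuming the 'parallel' package as a stand-in for snow.
library(parallel)

# 'data' has no default, so the function cannot silently fall back on a
# variable that exists only on the master.
fun4 <- function(startInd, endInd = 3, data) data[startInd:endInd, ]

# Hypothetical wrapper: 'Mat' is local to this function and is handed to the
# workers as an extra argument of clusterApplyLB.
run_slices <- function() {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  Mat <- matrix(2:10, 3, 3)
  clusterApplyLB(cl, 1:3, fun4, data = Mat)
}

res <- run_slices()
print(res)
```

With this pattern nothing needs to exist in any global environment. The cost is that 'Mat' is sent along with every task, which may matter for large objects and many small tasks.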