Question: WARNING: remove set.seed usage in R code
xyluo1991 wrote (7 weeks ago):

Dear All,

BiocCheck gave me the warning "WARNING: Remove set.seed usage in R code" because I call set.seed(seed) inside my R function to make results reproducible, where "seed" is an argument of the function that the user can specify. To pass BiocCheck, I would have to remove set.seed(seed) from the function, but I still want users to be able to choose a seed. How should I resolve this issue?

Thank you very much for your help!

Best,

Xiangyu 

shepherl wrote (7 weeks ago):

Generally we recommend that set.seed be called in the documentation and outside the function. Not only does this make it clear to the user that a seed is used, but it also leaves room to explain to the user why the seed is used.


x <- function() {
    # some code
}

set.seed(123)

x()

You could keep the seed argument in your functions and clearly document it. When you submit your package to the issue tracker, explain to the reviewer why a seed is set and your justification for keeping it in the function. Whether this is allowed will be at your reviewer's discretion, and they may insist on the former solution.


In addition to Lori's answer, here's a little anecdote.

I often perform simulations with randomly generated data to test the performance of various algorithms. I usually generate some data, test the method and compute some measure of performance; and repeat this for several iterations to ensure that I get representative estimates of the metric of interest. At one point, I noticed that the standard deviation of my metrics was extremely low. Why? Because someone had put set.seed inside their function, which affects the entire R session after the function call - this meant that my "randomly" generated data was always the same after the second iteration!

In short, it's always easy for users to call set.seed if they want to. But putting the set.seed inside functions can quietly lead to surprising side-effects in downstream code involving randomness. Moreover, it's much harder to "uncall" set.seed. Hence the advice from BiocCheck to not put set.seed inside the function.
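The effect described in the anecdote can be reproduced with a small sketch. The function bad_method() below is hypothetical, invented only for illustration; it stands in for any function that calls set.seed internally.

```r
# Hypothetical function that resets the seed internally
# (the pattern BiocCheck warns about).
bad_method <- function(x, seed = 1) {
    set.seed(seed)        # side effect: resets the session-wide RNG state
    mean(x) + rnorm(1)    # stand-in for a method that uses randomness
}

# Simulation loop: each iteration is meant to draw fresh random data.
set.seed(42)
first_values <- numeric(5)
for (i in seq_len(5)) {
    x <- rnorm(10)            # "random" data for this iteration
    first_values[i] <- x[1]   # remember the first value drawn
    bad_method(x)             # leaves the RNG in the same state every time
}

# From the second iteration onwards the generated data are identical.
length(unique(first_values[-1]))   # 1
```

Only the first iteration uses the state set by the caller; every later iteration starts from the state that bad_method() left behind.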

Written 7 weeks ago by Aaron Lun

The proper way would be to test whether .Random.seed exists and, if it does, to save it and restore it upon exit. In my packages (which live on CRAN, not Bioconductor) I also tend to allow the user to request that the random seed not be set within the function, by supplying NULL to the argument randomSeed below.

foo = function(..., randomSeed=1)
{
    if (!is.null(randomSeed)) {
        if (exists(".Random.seed")) {
            savedSeed = .Random.seed
            on.exit(.Random.seed <<- savedSeed)
        }
        set.seed(randomSeed)
    }
    # actual code...
}
Written 7 weeks ago by Peter Langfelder

It seems by setting the random seed (I don't know the context, so could be off-base here) you're somehow overstating the reproducibility of foo() in the manner illustrated by Aaron's anecdote; it seems better to have NULL as the default?

Artificial, but

f = function() {
    .Random.seed <- 1
    function() {
        seed <- .Random.seed
        on.exit(.Random.seed <<- seed)
        rnorm(10)
    }
}

modifies the .Random.seed of the generator.

set.seed(123)
xx <- .Random.seed
res <- f()()
identical(xx, .Random.seed)  # FALSE

Maybe it's safer (since the user can manipulate the parent environment but not the location of .GlobalEnv in the search() path) with

f = function() {
    .Random.seed <- 1
    function() {
        seed <- get(".Random.seed", 1)
        on.exit(assign(".Random.seed", seed, 1))
        rnorm(10)
    }
}
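
Putting the suggestions in this thread together, one possible sketch is a small helper that defaults to not touching the seed (NULL) and otherwise saves and restores .Random.seed explicitly in the global environment. The name with_temp_seed and its arguments are assumptions made for illustration, not an existing API.

```r
# Hypothetical helper: evaluate `code` under a temporary seed, then restore
# the RNG state that the session had before the call.
with_temp_seed <- function(code, seed = NULL) {
    if (is.null(seed)) return(code)   # no seed requested: leave the RNG alone

    genv <- globalenv()
    had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
    if (had_seed) old_seed <- get(".Random.seed", envir = genv, inherits = FALSE)

    on.exit({
        if (had_seed) {
            # put back the state the caller had
            assign(".Random.seed", old_seed, envir = genv)
        } else if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
            # the RNG was not initialised before the call: remove our state
            rm(".Random.seed", envir = genv)
        }
    })

    set.seed(seed)
    code   # lazily evaluated argument: forced here, after the seed is set
}

set.seed(123)
before <- get(".Random.seed", envir = globalenv())
a <- with_temp_seed(rnorm(3), seed = 1)
b <- with_temp_seed(rnorm(3), seed = 1)
identical(a, b)                                            # TRUE
identical(before, get(".Random.seed", envir = globalenv()))  # TRUE
```

Looking up .Random.seed with envir = globalenv() and inherits = FALSE avoids picking up a variable of the same name defined by calling code, which is the case the artificial example above constructs.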
Written 7 weeks ago by Martin Morgan

You're right, I didn't think of the possibility of calling code defining its own .Random.seed (which is probably not very frequent but certainly possible).

Written 6 weeks ago by Peter Langfelder

Thank you very much for the great replies! I have removed the set.seed call from my function and now set the seed explicitly outside it. I greatly appreciate these helpful comments!

Written 25 days ago by xyluo1991