Dear All,
BiocCheck gave me the warning "WARNING: Remove set.seed usage in R code" because I call set.seed(seed) in my R function to make results reproducible, where "seed" is an argument of the function that the user can specify. To satisfy BiocCheck, I would have to remove set.seed(seed) from my function, but I still want users to have the option of choosing a seed. How should I resolve this issue?
Thank you very much for your help!
Best,
Xiangyu
In addition to Lori's answer, here's a little anecdote.
I often perform simulations with randomly generated data to test the performance of various algorithms. I usually generate some data, test the method and compute some measure of performance, and repeat this for several iterations to ensure that I get representative estimates of the metric of interest. At one point, I noticed that the standard deviation of my metrics was extremely low. Why? Because someone had put set.seed inside their function, which affects the entire R session after the function call - this meant that my "randomly" generated data was always the same after the second iteration!
In short, it's always easy for users to call set.seed if they want to. But putting set.seed inside functions can quietly lead to surprising side-effects in downstream code involving randomness. Moreover, it's much harder to "uncall" set.seed. Hence the advice from BiocCheck to not put set.seed inside the function.
The proper way would be to test whether .Random.seed exists, save it, and restore it upon exit. In my packages (which live on CRAN, not Bioconductor) I also tend to allow the user to request that the random seed not be set within the function, by supplying NULL to the argument randomSeed below.
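A minimal sketch of that save-and-restore pattern, not the poster's actual code. The function name foo, its argument n, its body, and the default seed value are hypothetical placeholders; only the argument name randomSeed comes from the post. This simple form relies on ordinary scoping to find .Random.seed.

```r
# Hypothetical function: draws n uniform numbers, optionally with a fixed seed.
# The default 12345 is an arbitrary placeholder; NULL means "do not touch the RNG".
foo <- function(n, randomSeed = 12345) {
  if (!is.null(randomSeed)) {
    if (exists(".Random.seed")) {
      # Save the caller's generator state and put it back when we leave.
      savedSeed <- .Random.seed
      on.exit(.Random.seed <<- savedSeed)
    }
    set.seed(randomSeed)
  }
  runif(n)
}

# The caller's random stream should be unaffected by the call to foo().
set.seed(1)
expected <- runif(1)            # next draw when foo() is never called

set.seed(1)
x <- foo(3, randomSeed = 42)    # reproducible inside foo()
after <- runif(1)               # next draw after foo() has returned

identical(after, expected)      # TRUE: the state was restored on exit
```

With randomSeed = NULL the function never calls set.seed, so it simply consumes numbers from whatever stream the caller has set up.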
It seems by setting the random seed (I don't know the context, so could be off-base here) you're somehow overstating the reproducibility of foo() in the manner illustrated by Aaron's anecdote; it seems better to have NULL as the default?
Artificial, but
modifies the .Random.seed of the generator.
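One hypothetical way this can happen (not necessarily the example the poster had in mind): if a variable named .Random.seed exists in an environment enclosing the function, an unqualified lookup and `<<-` both find that variable first, so the real generator state in the global environment is never restored. The names make_foo and foo below are illustrative only.

```r
make_foo <- function() {
  .Random.seed <- 1L   # a stray variable that shadows the real generator state
  function(randomSeed = 42) {
    if (exists(".Random.seed")) {
      savedSeed <- .Random.seed             # picks up the stray value, 1L
      on.exit(.Random.seed <<- savedSeed)   # writes back to the stray variable
    }
    set.seed(randomSeed)   # set.seed always changes the state in .GlobalEnv
    runif(1)
  }
}
foo <- make_foo()

# Two different seeds chosen by the caller...
set.seed(1); invisible(foo()); a <- runif(1)
set.seed(2); invisible(foo()); b <- runif(1)

# ...yet the draws after foo() are the same, because the global state
# was left at whatever set.seed(42) plus one draw produced.
identical(a, b)   # TRUE
```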
Maybe it's safer (since the user can manipulate the parent environment but not the location of .GlobalEnv in the search() path) with
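A minimal sketch of what such a version might look like, assuming the idea is to read and write .Random.seed explicitly in .GlobalEnv instead of relying on scoping. The function foo, its argument n, and its body are hypothetical; the branch that removes a freshly created state is an extra assumption that goes beyond what the post describes.

```r
foo <- function(n, randomSeed = NULL) {
  if (!is.null(randomSeed)) {
    hadSeed <- exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
    if (hadSeed) {
      # Save the real generator state and restore it in .GlobalEnv on exit.
      savedSeed <- get(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
      on.exit(assign(".Random.seed", savedSeed, envir = .GlobalEnv))
    } else {
      # No state existed yet: remove the one set.seed is about to create,
      # so R initialises the generator afresh on the next draw.
      on.exit(rm(list = ".Random.seed", envir = .GlobalEnv))
    }
    set.seed(randomSeed)
  }
  runif(n)
}

set.seed(1)
expected <- runif(1)            # next draw when foo() is never called

set.seed(1)
x <- foo(3, randomSeed = 42)
after <- runif(1)

identical(after, expected)      # TRUE: the caller's stream is untouched
```

Because the environment is named explicitly, a variable called .Random.seed defined elsewhere cannot interfere with the save or the restore.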
You're right, I didn't think of the possibility of calling code defining its own .Random.seed (which is probably not very frequent but certainly possible).
Thank you very much for the great replies! I have removed set.seed from my function and now call it explicitly outside. I greatly appreciate these helpful comments!