I recently discovered that the application of (at least) norm2Filter() is not consistent when replicated.  I've pasted an example below.  In the example dataset the differences are small--just a few events.  In my much larger experimental datasets, however, the number of events changes by the hundreds and can significantly alter some of the downstream analysis.

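library(flowCore)
library(flowViz)
## dat is assumed to be a flowFrame containing the FSC-H and SSC-H channels.
n2f <- norm2Filter(filterId="myNorm2Filter", x=list("FSC-H", "SSC-H"), scale.factor=1)
## Channel names containing a hyphen must be backquoted in the formula.
xyplot(`FSC-H` ~ `SSC-H`, data=dat, filter=n2f, smooth=FALSE, xbin=256, stats=TRUE)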
## Same filter, inconsistent subsetting.
sapply(1:15, function(x) {  fres <- Subset(dat, n2f); return(nrow(fres))  })

I soon realized that if I call set.seed() with a fixed value immediately before the subset, the issue goes away, and the same number of events (and presumably the same events) is returned each time.

sapply(1:15, function(x) {  set.seed(1); fres <- Subset(dat, n2f); return(nrow(fres))  })

Is this because the Subset() command in combination with the norm2Filter() is using some kind of "training set" which is randomly selected?  How can I modify the norm2Filter() and/or Subset() functions to use the WHOLE dataset so that my analysis is not sensitive to the RNG?

The %in% method for norm2Filter (the actual computing engine that the Subset method dispatches to) uses the CovMcd function from the rrcov package to estimate the covariance matrix. By default, CovMcd draws random samples of the data, so its result depends on the state of the random number generator. I don't think we should change that behavior.

What you did was right: set the seed explicitly before the Subset call.
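
If you would rather not reset the global RNG every time you subset, one option is to wrap the call in a small helper that fixes the seed and then restores the previous RNG state. This is only a sketch in base R: with_seed is a hypothetical name, not something flowCore or rrcov provides, and the flowCore call at the end is left as a comment because it assumes the dat and n2f objects from the original post.

```r
# Hypothetical helper (not part of flowCore or rrcov): evaluate `expr` under a
# fixed seed, then put the caller's RNG state back so later code is unaffected.
with_seed <- function(seed, expr) {
  had_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_state) {
    old_state <- get(".Random.seed", envir = globalenv())
    on.exit(assign(".Random.seed", old_state, envir = globalenv()), add = TRUE)
  } else {
    # No RNG state existed before the call, so remove the one set.seed() creates.
    on.exit(rm(".Random.seed", envir = globalenv()), add = TRUE)
  }
  set.seed(seed)
  expr  # lazily evaluated argument: it is forced here, after set.seed()
}

# Base R check: the same seed gives the same random draw on every call.
draws <- sapply(1:5, function(i) with_seed(1, sample(1000, 1)))
print(length(unique(draws)) == 1)

# Assumed usage with the objects from the question (dat, n2f):
# sapply(1:15, function(i) nrow(with_seed(1, Subset(dat, n2f))))
```

The seed value itself is arbitrary. What matters is that the same value is used for every run you want to compare, and that it is recorded alongside the analysis.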