Inconsistent subsetting using norm2Filter in flowCore
peterfoster ▴ 20

I recently discovered that applying (at least) `norm2Filter()` does not give the same result when repeated on the same data. I've pasted an example below. In the example dataset the differences are small, just a few events. In my much larger experimental datasets, however, the number of events changes by the hundreds, which can significantly alter some of the downstream analysis.

## Loading example data
library(flowCore)
library(flowViz)  # provides the xyplot() method used below
dat <- read.FCS(system.file("extdata", "0877408774.B08", package="flowCore"))
n2f <- norm2Filter(filterId="myNorm2Filter", x=list("FSC-H", "SSC-H"), scale.factor=1)
xyplot(`FSC-H` ~ `SSC-H`, data=dat, filter=n2f, smooth=FALSE, xbin=256, stats=TRUE)
## Same filter, inconsistent subsetting: count the events kept in 15 repeated runs.
sapply(1:15, function(i) nrow(Subset(dat, n2f)))

I soon realized that if I call `set.seed()` before subsetting, the issue goes away, and the same number of events (and presumably the same events) is returned each time.

sapply(1:15, function(x) {  set.seed(1); fres <- Subset(dat, n2f); return(nrow(fres))  })

Is this because the `Subset()` command in combination with the `norm2Filter()` is using some kind of "training set" which is randomly selected?  How can I modify the `norm2Filter()` and/or `Subset()` functions to use the WHOLE dataset so that my analysis is not sensitive to the RNG?

 

Jiang, Mike ★ 1.3k

The `%in%` method for `norm2Filter` (the actual computing engine dispatched by the `Subset()` method) uses the `CovMcd()` function from the `rrcov` package to estimate the covariance matrix. By default, `CovMcd()` draws random samples of the data, so its result depends on the state of the random number generator. I don't think we should change that behavior.

What you did was right: set the seed explicitly before the `Subset()` call.
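
If you would rather not reset the global seed in the middle of a longer script, one option is to wrap the call in a small helper. The sketch below is only an illustration: `with_fixed_seed()` is a hypothetical name, not something flowCore provides. It sets the seed, evaluates the expression, and then restores whatever RNG state the session had before. The flowCore part reuses the objects from the question and only runs if the package is installed.

```r
# Hypothetical helper (not part of flowCore): evaluate `expr` under a fixed
# seed, then restore the RNG state the session had before the call.
with_fixed_seed <- function(seed, expr) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (had_seed) old_seed <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  })
  set.seed(seed)
  force(expr)  # `expr` is lazy, so it is evaluated here, after set.seed()
}

# Usage with the example from the question (skipped if flowCore is missing).
if (requireNamespace("flowCore", quietly = TRUE)) {
  library(flowCore)
  dat <- read.FCS(system.file("extdata", "0877408774.B08", package = "flowCore"))
  n2f <- norm2Filter(filterId = "myNorm2Filter", x = list("FSC-H", "SSC-H"),
                     scale.factor = 1)
  # Every run now starts from the same RNG state.
  counts <- sapply(1:5, function(i) nrow(with_fixed_seed(1, Subset(dat, n2f))))
  print(counts)
}
```

The seed value itself is arbitrary. What matters is that it is the same every time the filter is applied, and that you record it alongside the analysis.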
