Question: Inconsistent Subsetting using norm2Filter in flowCore
4.2 years ago by peterfoster (United States):

I recently discovered that applying (at least) `norm2Filter()` does not give consistent results when the same call is repeated. I've pasted an example below. In the example dataset the differences are small, just a few events. In my much larger experimental datasets, however, the number of events changes by the hundreds and can significantly alter some of the downstream analysis.

## Loading example data
library(flowCore)
library(flowViz)  # provides the xyplot() method used below
dat <- read.FCS(system.file("extdata","0877408774.B08", package="flowCore"))
n2f <- norm2Filter(filterId="myNorm2Filter", x=list("FSC-H", "SSC-H"), scale.factor=1)
xyplot(`FSC-H`~`SSC-H`, data=dat, filter=n2f, smooth=FALSE, xbin=256, stats=TRUE)
## Same filter, inconsistent subsetting.
sapply(1:15, function(x) {  fres <- Subset(dat, n2f); return(nrow(fres))  })

I soon realized that if I `set.seed()` prior to the subset, the issue goes away, and the same number of events (and presumably the same ones) are returned each time.

sapply(1:15, function(x) {  set.seed(1); fres <- Subset(dat, n2f); return(nrow(fres))  })

Is this because the `Subset()` command in combination with the `norm2Filter()` is using some kind of "training set" which is randomly selected?  How can I modify the `norm2Filter()` and/or `Subset()` functions to use the WHOLE dataset so that my analysis is not sensitive to the RNG?

Answer: Inconsistent Subsetting using norm2Filter in flowCore
4.2 years ago by Jiang, Mike:

The `%in%` method for `norm2Filter` (the actual computing engine dispatched by the `Subset` method) uses the `CovMcd` function from the `rrcov` package to estimate the covariance matrix. By default, `CovMcd` draws random samples of the data, so its result depends on the random seed. I don't think we should change that behavior.

What you did was right: set the seed explicitly before the `Subset` call.
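
If calling `set.seed()` inside an analysis loop is a concern because it also resets the random stream for everything that runs afterwards, one option is to fix the seed only for the duration of the stochastic call. The following is a minimal sketch in base R, not something flowCore or rrcov provides: `with_fixed_seed` is a hypothetical helper name, and `noisy_count` is only a stand-in for a stochastic call such as `Subset(dat, n2f)`.

```r
# Hypothetical helper (not part of flowCore or rrcov): evaluate `expr` under a
# fixed seed, then restore the caller's RNG state on exit.
with_fixed_seed <- function(seed, expr) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_seed <- if (has_seed) get(".Random.seed", envir = globalenv()) else NULL
  on.exit({
    if (has_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)
  set.seed(seed)
  expr  # lazily evaluated argument: forced here, after set.seed()
}

# Stand-in for a stochastic call such as Subset(dat, n2f)
noisy_count <- function() sum(sample(100, 10))

# Every replicate runs under the same seed, so all 15 results agree
counts <- sapply(1:15, function(i) with_fixed_seed(1, noisy_count()))
length(unique(counts)) == 1
```

With flowCore loaded, the same pattern would presumably be used as `with_fixed_seed(1, Subset(dat, n2f))`. Note that this makes the subset reproducible, but the result still depends on which seed is chosen.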
