Technically the quantiles are the (0,25,50,75,100) percentiles of the data <- That is like, not correct.
Quantiles can be computed using the
quantile function. Using the data from the vignette:
> ps <- getStats(BS.cancer.ex.tstat)[,"tstat.corrected"]
0% 25% 50% 75% 100%
-10.9073281 -0.5864714 0.1185087 0.8609352 14.3986699
But I doubt that you want to use any of those percentiles as a cutoff. More likely you want to look at the (95, 99, 99.5) percentiles instead.
> quantile(ps, c(0,0.005, 0.01, 0.05, 0.95,0.99, 0.995,1))
0% 0.5% 1% 5% 95% 99% 99.5%
-10.907328 -4.115913 -3.447037 -1.998064 2.622552 5.918016 8.127983
And you could exclude all but about 2% of the regions under contention by using the cutoffs that are used in the vignette
> sum(ps < -4.6 | ps > 4.6)/length(ps)
You could use the p-value for those t-statistics, but it wouldn't be materially different than using the t-statistics directly.
The idea here is pretty simple - you just want to get a relatively small number of CpG sites that look to be differentially methylated, and then see if any of them are all together in one or more regions. How you select the CpG sites will necessarily be pretty ad hoc, and you just want to make sure you don't have too many (don't want too many false positives creeping in) or too few (you do want some results, no?). In the end you need some criterion to say if a CpG is in or out, and using the t-stat or the p-value will result in pretty much the same thing.
To see what I mean, you can make a 'volcano plot' of the data:
> plot(ps, -log10(pt(abs(ps), 8, lower.tail = FALSE)))
> abline(v = c(-4.6, 4.6))
The cutoff in the vignette is choosing all the CpG sites outside of those two vertical lines. Using a p-value cutoff would be the same as putting a horizontal line in there somewhere, at say
> abline(h = 3.1)
And taking all the CpG sites above that line. Which just so happens to be the same exact CpG sites.