I am interested in details about the random component in probe selection at both the intra- and inter-array levels for preprocessSWAN() in minfi.
In the original paper I don't see any explanation of an explicit inter-array normalization step, only that the intra-array normalization reduces technical variability between arrays. However, I found the following statement in the documentation for minfi's preprocessSWAN function:
"SWAN uses a random subset of probes to do the between array normalization. In order to achive reproducible results, the seed needs to be set using set.seed."
So I am wondering:
1. Why does the seed need to be set only for the inter-array normalization and not for the intra-array normalization? Is it because run-to-run differences from the intra-array step are negligible? Should I be worried that random intra-array probe selection hinders reproducibility?
2. What is happening in the inter-array normalization? As I understand it, SWAN (Subset-quantile Within Array Normalization) selects subsets of probes with varying numbers of internal CpGs, uses them to define an intensity distribution for each assay type, and then normalizes the remaining probes on the array to that distribution. Which part of this process is used for the inter-array normalization?
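To make the question concrete, here is the toy picture I have in mind. It is written in Python rather than R, it is not minfi's code, and every name in it (pick_subset, subset_reference, the made-up probe IDs and intensities) is hypothetical. It only shows why a reference distribution built from a randomly drawn probe subset changes from run to run unless the seed is fixed.

```python
import random


def pick_subset(probe_ids, n, seed=None):
    """Draw n probe IDs at random; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)  # seed=None falls back to system entropy
    return sorted(rng.sample(probe_ids, n))


def subset_reference(intensities, subset):
    """Sorted intensities of the chosen probes: a toy reference distribution."""
    return sorted(intensities[p] for p in subset)


# Made-up probe IDs and intensities, purely for illustration.
data_rng = random.Random(0)
probe_ids = [f"cg{i:04d}" for i in range(200)]
intensities = {p: data_rng.uniform(100, 5000) for p in probe_ids}

# Same seed: the same probes are drawn, so the reference is identical.
seeded_a = pick_subset(probe_ids, 20, seed=42)
seeded_b = pick_subset(probe_ids, 20, seed=42)
same_with_seed = seeded_a == seeded_b
ref_a = subset_reference(intensities, seeded_a)
ref_b = subset_reference(intensities, seeded_b)

# No seed: two draws will almost always pick different probes.
unseeded_a = pick_subset(probe_ids, 20)
unseeded_b = pick_subset(probe_ids, 20)

print("seeded draws identical:", same_with_seed)
print("unseeded draws identical:", unseeded_a == unseeded_b)
```

If preprocessSWAN draws its probe subsets in a similar way, I would expect any step that uses the random subset to depend on the seed, which is why the wording of the documentation confuses me.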
Thanks as always.