For all intents and purposes, unless you are using a method that can take advantage of the observation weights that come out of
voom(), "voom transformed data" is essentially just "log2cpm" with a small prior count (0.5).
The problem is that, with such a small prior count, your log2cpm data will have more variance around the lower expression values, whereas your downstream analysis tools will likely expect the data to be more homoscedastic. This is OK for voom, because the weights are incorporated in the analysis, but they are likely not used in your KNN procedure, or whatever else you want to throw at it.
As you call
edgeR::cpm(y, log = TRUE, prior.count = N) with larger and larger values of
N you will "hammer out" more and more of the variance at the low end of expression. It is often suggested on this support forum to use a value for
prior.count between 3 and 5 to get your data "approximately" where you want it to be before feeding it into clustering, PCA, or whatever other algorithm you choose to run. So you should prefer this approach over using the "voomed" data.
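To see the effect on toy data, here is a minimal sketch in base R. The simulated counts, the gene and sample numbers, and the helper log_cpm() are all made up for illustration; log_cpm() is only a simplified stand-in for the idea behind edgeR::cpm(y, log = TRUE, prior.count = N), not edgeR's actual code, so treat the numbers as qualitative.

```r
# Simulated counts (hypothetical): 200 genes x 6 samples with no true
# differences between samples. The first 100 genes are lowly expressed
# (mean count 2), the last 100 are highly expressed (mean count 1000).
set.seed(1)
n_genes <- 200
n_samples <- 6
mu <- rep(c(2, 1000), each = n_genes / 2)
counts <- matrix(rnbinom(n_genes * n_samples, mu = mu, size = 10),
                 nrow = n_genes)

# Hypothetical helper: log2 counts-per-million with a prior count that is
# scaled by relative library size before being added to every count.
log_cpm <- function(counts, prior_count) {
  lib_size <- colSums(counts)
  scaled_prior <- prior_count * lib_size / mean(lib_size)
  # t(counts) is samples x genes, so the per-sample vectors recycle correctly
  log2(t((t(counts) + scaled_prior) / (lib_size + 2 * scaled_prior)) * 1e6)
}

# Average per-gene standard deviation across samples for a set of genes
mean_gene_sd <- function(x, genes) mean(apply(x[genes, , drop = FALSE], 1, sd))

low_genes <- seq_len(n_genes / 2)
high_genes <- seq(n_genes / 2 + 1, n_genes)

small_prior <- log_cpm(counts, prior_count = 0.5)  # roughly what voom does
large_prior <- log_cpm(counts, prior_count = 5)

sd_low_small <- mean_gene_sd(small_prior, low_genes)
sd_low_large <- mean_gene_sd(large_prior, low_genes)
sd_high_small <- mean_gene_sd(small_prior, high_genes)
sd_high_large <- mean_gene_sd(large_prior, high_genes)

print(c(low_prior0.5 = sd_low_small, low_prior5 = sd_low_large,
        high_prior0.5 = sd_high_small, high_prior5 = sd_high_large))
```

The thing to look at is how much the spread of the low-count genes shrinks when the prior count goes from 0.5 to 5, compared with how little the high-count genes move. On real data you would just call edgeR::cpm() directly rather than use a helper like this.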
Another approach is to use the output from the
vst (variance stabilizing transformation) method found in the DESeq2 package to do the same. Perhaps you can think of
vst in DESeq2 as similar to
edgeR::cpm(y, log = TRUE, prior.count = N), except that the value of
N isn't constant throughout, which is to say that its value adapts in some smart way within the vst procedure itself.