I am not sure there is a 'best' way to analyze NanoString data, particularly the normalization step.
Here are some observations from a set of recent analyses we performed. With a sufficiently large number of genes (one dataset had ~600 genes, with technical replicates for each sample), we found that normalizing to total counts using voom() gave better results than normalizing to the geometric mean of the supplied housekeeping genes, or to a set of putative housekeeping genes chosen for having low variance. Adding a quantile or cyclic loess normalization on top improved results moderately, but at the cost of violating Occam's razor, so we chose to use voom() without further normalization.
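As a minimal sketch of the housekeeping-gene alternative mentioned above: the idea is to scale each sample by the geometric mean of its housekeeping counts. The matrix and the housekeeping flags here are simulated stand-ins, not real NanoString data.

```r
# Simulated counts matrix (genes x samples) standing in for NanoString data.
set.seed(1)
counts <- matrix(rpois(60, lambda = 100), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))

# Pretend the first three probes are the supplied housekeeping genes.
hk <- rownames(counts) %in% c("gene1", "gene2", "gene3")

# Per-sample scaling factor: geometric mean of the housekeeping counts,
# rescaled so the factors average to 1 across samples.
geo_mean <- function(x) exp(mean(log(x)))
hk_factors <- apply(counts[hk, , drop = FALSE], 2, geo_mean)
norm_counts <- sweep(counts, 2, hk_factors / mean(hk_factors), "/")
```

In contrast, voom() with its defaults normalizes each sample by its column sum (total counts) when computing log-cpm, which is the approach that won out for us on the large panel.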
Note here that we assessed the results by comparing the technical replicates, and went with the normalization that did the best job of making the technical replicates look most similar to each other.
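One simple way to operationalize that assessment is to score each candidate normalization by the mean correlation between technical replicate pairs on the log scale. This is a hypothetical sketch with simulated data, not our actual pipeline:

```r
# Fake log2 expression matrix: 100 genes, two samples with two technical
# replicates each. Replace with the output of each candidate normalization.
set.seed(1)
logmat <- matrix(rnorm(400, mean = 8), nrow = 100)
colnames(logmat) <- c("A_rep1", "A_rep2", "B_rep1", "B_rep2")

# Score = mean Pearson correlation across the replicate pairs.
rep_pairs <- list(c("A_rep1", "A_rep2"), c("B_rep1", "B_rep2"))
score <- mean(sapply(rep_pairs,
                     function(p) cor(logmat[, p[1]], logmat[, p[2]])))

# Compute 'score' once per candidate normalization and keep the winner.
```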
For this first set of samples we assumed that most of the genes were being expressed, and at relatively similar levels, so normalizing to library size was a reasonable thing to do. We also analyzed a smaller, more focused panel (~200 genes) for which we couldn't assume the genes were in general expressed at similar levels (a particular treatment may well have reduced the expression of most of these genes, in which case normalizing to library size would have washed out real biological differences). In that situation we had to use the NanoString-supplied housekeeping genes.
In your situation, NanoString supplies both mRNA housekeeping genes, as well as some probes against non-mammalian small RNAs. I assume you spike in the non-mammalian small RNAs, like the ERCC probes. However, the spiked-in RNA only gives you information about the variability introduced after the spike-in occurred (plus, if you are spiking in using small volumes and a Rainin pipettor, then you are likely introducing more variability in that step than you think). So you have mRNA housekeeping genes that may correlate with the amount of miRNA that you originally started with, and some spike-ins that help assess variability introduced later in the process.
In my experience, most miRNAs seem not to be expressed, and those that are expressed appear to be at low levels. This is based on results from Affy's miRNA arrays, which are sort of questionable, but that seems to be the expectation for miRNAs, so I don't think it is far from the truth.
So all of the normalization methods have pretty serious issues for the miRNA panel, and I think you have to choose the one you consider least bad. But do note that if you decide to use limma-voom to analyze the data, and you want to normalize using something other than total counts, you have to take steps to keep voom() from normalizing to total counts itself.
In other words, if you just feed your counts into voom(), it will by default compute the library sizes as the column sums and then compute counts per million, which is normalizing to library size. If you don't want that to happen, you need to supply your own value for the 'lib.size' argument to voom().
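A short sketch of the difference, using simulated counts and a placeholder design; the 'my_factors' vector stands in for whatever per-sample normalization you computed upstream (e.g. from housekeeping genes):

```r
library(limma)

# Simulated counts: 50 genes, 6 samples, two groups of three.
set.seed(2)
counts <- matrix(rpois(300, lambda = 50), nrow = 50)
design <- model.matrix(~ rep(c(0, 1), each = 3))

# Default: voom() uses colSums(counts) as library sizes, i.e. it normalizes
# to total counts internally when computing log-cpm.
v_default <- voom(counts, design)

# To normalize some other way, pass your own per-sample sizes via 'lib.size'
# so voom() does not fall back to the column sums. 'my_factors' is a
# hypothetical placeholder; substitute your housekeeping-derived values.
my_factors <- rep(1e6, ncol(counts))
v_custom <- voom(counts, design, lib.size = my_factors)
```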