It's not an issue of scale, really. You could convert to z-scores, in which case all the data would be on a common scale (mean 0, SD 1), and any clustering would still almost certainly put the RNA-Seq out by itself.
In other words, the measures of gene expression that you get from microarrays and RNA-Seq are at best correlated with the underlying gene expression. They aren't a direct measure of it, and they can't be compared directly. I wouldn't even try to combine microarray data from different experiments, let alone completely different ways of measuring the gene expression.
As an example, in the more recent Affy arrays there is a set of anti-genomic probes that are designed to have no complementary sequences in any organism, and hence are not expected to bind to anything in a biological sample. These anti-genomic probes vary from almost pure AT to almost pure GC content. And as the GC content goes up, the binding goes up, to a saturated signal. So if you have an Affy probe that has super high GC content, it will bind to, like, anything. And that signal has nothing to do with a measurement of gene expression because these probes aren't designed to measure any gene expression! So for Affy probes, the signal you get is some combination of underlying transcript abundance and non-specific binding that goes up as the GC content increases.
If you assume that the GC-specific binding is pretty consistent between samples, then that all comes out in the wash when you compare groups (well, not exactly - as the GC-specific binding increases, your apparent fold change decreases - but algebraically it gets subtracted out). RNA-Seq has its own biases that are different from microarray biases, and simply scaling the data to have the same distribution won't correct for those biases.
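To make the "subtracted out, but the fold change shrinks" point concrete, here is a minimal sketch in plain Python. It assumes a purely additive model (observed intensity = true signal + non-specific background), which is a simplification I am introducing for illustration, and all numbers are made up.

```python
import math

def observed(signal, background):
    """Observed probe intensity under an assumed additive model:
    true signal plus non-specific (e.g. GC-driven) binding."""
    return signal + background

def apparent_log2_fc(signal_a, signal_b, background):
    """log2 ratio of observed intensities when both groups share the same background."""
    return math.log2(observed(signal_b, background) / observed(signal_a, background))

# Hypothetical true signals: a 2-fold change between groups.
signal_control, signal_treated = 100.0, 200.0

for background in (0.0, 100.0, 1000.0):
    # The shared background cancels in the difference...
    diff = observed(signal_treated, background) - observed(signal_control, background)
    # ...but it compresses the ratio toward 1 (log2 fold change toward 0).
    fc = apparent_log2_fc(signal_control, signal_treated, background)
    print(f"background={background:7.1f}  difference={diff:6.1f}  log2FC={fc:.3f}")
```

Under this assumption the difference between groups stays the same whatever the background is, while the apparent log2 fold change falls from 1 toward 0 as the background grows.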
This was explained, and nevertheless the request is to try to see how similar the RNA-Seq sample is to one of the microarray conditions. If similarity to one of them is apparent, it will be a cue to check similarity to that condition with other, more suitable tools.
Because there is only one RNA-Seq sample, the RNA-Seq signals cannot be converted to z-scores.
Is converting to log2-TPM and quantile-normalizing together with the microarray samples reasonable?
Also, perhaps it is possible to use voom with one sample only?
It's possible to run `voom` with one sample, but you will find it's the same as running `cpm` with a prior count of 0.5 and log = TRUE.
And I have already said that there are biases that are confounded by technology, so any comparisons are some combination of the underlying gene expression and technical differences. And these differences shouldn't be expected to be monotonic, so a quantile normalization is probably futile.
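For what it's worth, the voom/cpm equivalence can be sketched in plain Python. This re-implements the two formulas as I understand them from the limma and edgeR documentation (it does not call either package), and the function names and counts are hypothetical.

```python
import math

def voom_style_log_cpm(counts):
    """log2-CPM as voom computes it (assumed formula): offset 0.5 on counts, 1 on library size."""
    lib_size = sum(counts)
    return [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in counts]

def cpm_log_prior(counts_by_sample, prior_count=0.5):
    """log2-CPM in the style of edgeR's cpm(log=TRUE) (assumed formula):
    the prior count is scaled by each library size relative to the mean library size."""
    lib_sizes = [sum(sample) for sample in counts_by_sample]
    mean_lib = sum(lib_sizes) / len(lib_sizes)
    result = []
    for sample, lib in zip(counts_by_sample, lib_sizes):
        scaled_prior = prior_count * lib / mean_lib   # equals prior_count when there is one sample
        adjusted_lib = lib + 2.0 * scaled_prior       # library size grows by twice the scaled prior
        result.append([math.log2((c + scaled_prior) / adjusted_lib * 1e6) for c in sample])
    return result

# Made-up counts for a single RNA-Seq sample.
single_sample = [0, 10, 250, 4000, 35]

voom_values = voom_style_log_cpm(single_sample)
cpm_values = cpm_log_prior([single_sample], prior_count=0.5)[0]

# With one sample the library size equals the mean library size, so the two agree.
print(all(math.isclose(v, c) for v, c in zip(voom_values, cpm_values)))
```

With more than one sample the scaled prior differs per library, so the two would no longer match exactly. Either way, this only puts the counts on a log scale; it does nothing about the technology-specific biases.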
But you seem bound and determined to do something, so why ask? Just do.