I have analysed a dataset in which control samples were compared with treated samples. It is a small pilot that served to compare two technologies (NanoString and EdgeSeq). The data are basically RNA-seq, i.e. counts. What I did was, on the one hand, TMM normalization with quasi-likelihood testing, and on the other hand quantile normalization with limma testing, to look for significant genes in the contrast 'treated - control'. As far as I can judge, these are just the usual pipelines for this kind of data.
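Concretely, it boils down to something like the sketch below. The object names (`counts`, a 460-gene count matrix with samples in columns, and `group`, a control/treated factor) are placeholders, and I am paraphrasing my script with the standard edgeR/limma calls rather than quoting it verbatim:

```r
library(edgeR)
library(limma)

design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)                     # "control", "treated"
contr <- makeContrasts(treated - control, levels = design)

## Pipeline 1: TMM normalization + quasi-likelihood F-test (edgeR)
dge <- DGEList(counts = counts, group = group)
dge <- calcNormFactors(dge, method = "TMM")
dge <- estimateDisp(dge, design)
fitQL <- glmQLFit(dge, design)
resQL <- glmQLFTest(fitQL, contrast = contr)

## Pipeline 2: quantile normalization of log2-CPM + limma (trend)
logCPM <- cpm(counts, log = TRUE, prior.count = 3)
logCPM <- normalizeBetweenArrays(logCPM, method = "quantile")
fitLm <- lmFit(logCPM, design)
fitLm <- contrasts.fit(fitLm, contr)
fitLm <- eBayes(fitLm, trend = TRUE)
```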
The assumption behind TMM is that the majority of genes are not differentially expressed. Quantile normalization does not make that assumption as such, but it does assume that the samples have identical/similar data distributions and that global differences between them are due to technical variation.
What I find is that about 200-300 of the 460 genes in the panel (depending on the pipeline, see above) change significantly in 'treated - control'. That seemed like quite a lot to me.
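For reference, the counts above come from something like the following, continuing from the sketch and assuming a (hypothetical) 5% FDR cutoff:

```r
# Number of significant genes per pipeline at FDR < 0.05
sum(topTags(resQL, n = Inf)$table$FDR < 0.05)     # edgeR / TMM pipeline
sum(topTable(fitLm, n = Inf)$adj.P.Val < 0.05)    # limma / quantile pipeline
```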
Can I still use TMM normalization here? Or is it robust enough to accommodate a situation where only about a third of the genes (160 of 460 in the worst case) are non-changing? Is there a sensible limit on the proportion of genes that should -not- change significantly?
Then, of course, I start wondering about the quantile normalization as well, because the control data distribution may be different from the treated data distribution, although I do not see that in the boxplots (see the sketch below). So I would say quantile normalization is fine to use.
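This is roughly the check I mean, again continuing from the sketch above; I look at the per-sample distributions before quantile normalization, since afterwards they are identical by construction:

```r
# Per-sample log2-CPM distributions, coloured by group,
# before quantile normalization
rawLogCPM <- cpm(counts, log = TRUE, prior.count = 3)
boxplot(rawLogCPM, col = c("grey", "tomato")[group], las = 2,
        ylab = "log2-CPM (before quantile normalization)")
```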
Many thanks for your help and advice on this!