Question: Limit on significant genes for TMM normalization
bilcodygm wrote, 20 months ago:

I have analysed a dataset in which control samples were compared with treated samples. It is a small pilot that served to compare two technologies (NanoString and EdgeSeq). The data are basically RNA-seq counts. To look for significant genes in the contrast 'treated - control' I ran two pipelines: TMM normalization with quasi-likelihood testing on the one hand, and quantile normalization with limma testing on the other. As far as I can judge, these are the usual pipelines for this kind of data.

The assumption for TMM is that the majority of genes are not differentially expressed. For quantile normalization that is not strictly the assumption, but it does assume that samples have identical/similar data distributions and that global differences are due to technical variation.
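To make the second assumption concrete: quantile normalization forces every sample onto the same empirical distribution. Below is a minimal numpy sketch of the idea (the function name is mine; this is not the limma/preprocessCore implementation and it ignores tie handling):

```python
import numpy as np

def quantile_normalize(counts):
    """Force every column (sample) onto the same empirical distribution.

    counts: genes x samples matrix. Each sample's sorted values are
    replaced by the mean of the sorted values across samples, then
    put back in that sample's original rank order.
    """
    counts = np.asarray(counts, dtype=float)
    order = np.argsort(counts, axis=0)                 # per-sample rank order
    mean_dist = np.sort(counts, axis=0).mean(axis=1)   # reference distribution
    normed = np.empty_like(counts)
    for j in range(counts.shape[1]):
        normed[order[:, j], j] = mean_dist
    return normed

# Two samples that differ only by a global scale factor:
# after normalization their distributions are identical.
x = np.array([[10., 20.],
              [ 5., 10.],
              [20., 40.]])
print(quantile_normalize(x))
```

Note that the method only preserves each gene's rank within a sample, which is exactly why it needs the assumption that the samples' true distributions are similar.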

What I find is that about 200-300 of the panel of 460 genes (depending on the pipeline, see above) are significantly changing in 'treated - control'. That is quite a lot, I thought.

Can I still use TMM normalization? Or is it robust enough to accommodate as few as 25% non-changing genes? Is there a sensible limit for the proportion of genes that should -not- change significantly?

Then, of course, I start wondering about the quantile normalization as well, because the control data distribution may be different from the treated data distribution, although I do not see that in the boxplots. So I would say quantile normalization is fine to use.

Many thanks for your help and advice on this!

Answer: Limit on significant genes for TMM normalization
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 20 months ago:

TMM and quantile normalization are not substantially different regarding the number of DE genes that they can tolerate.

Neither method is intended to tolerate 75% DE genes. You already know this, as you have correctly stated that TMM assumes the majority of genes to be non-DE.

However, normalizing is still better than not normalizing, even with 75% DE genes. And the results might be good enough if up- and down-regulated genes are more or less equally represented.

Quantile is the more robust of the two, as it normalizes for nonlinear effects as well as simple scaling. But TMM is a bit better at tolerating DE changes that are unbalanced, i.e. predominantly in one direction.
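To see why trimming helps with unbalanced DE, here is a deliberately simplified trimmed-mean-of-M-values calculation in Python. This is only a sketch of the idea behind TMM, not edgeR's calcNormFactors (the real method also trims on absolute expression and uses precision weights); the function name and trim fraction are illustrative:

```python
import numpy as np

def simple_tmm_factor(ref, sample, trim=0.3):
    """Simplified TMM: trimmed mean of per-gene log-ratios (M-values).

    ref, sample: count vectors over the same genes. Each is divided by
    its total, per-gene log2 ratios are computed, the most extreme 30%
    at each end are discarded, and the factor is 2^(mean of the rest).
    """
    ref = np.asarray(ref, dtype=float)
    sample = np.asarray(sample, dtype=float)
    keep = (ref > 0) & (sample > 0)                  # drop zero counts
    m = np.log2((sample[keep] / sample[keep].sum()) /
                (ref[keep] / ref[keep].sum()))
    m = np.sort(m)
    k = int(len(m) * trim)
    trimmed = m[k:len(m) - k] if k > 0 else m
    return 2.0 ** trimmed.mean()

# 100 genes, 20 of them doubled in one direction only (unbalanced DE).
# The trimming discards those extreme log-ratios, so the factor is
# driven by the 80 non-DE genes.
ref = np.full(100, 100.0)
sample = ref.copy()
sample[:20] *= 2
print(simple_tmm_factor(ref, sample))
```

In this toy example the 20 upregulated genes all land at the top of the sorted M-values and are trimmed away, so the factor recovers the shift seen by the non-DE genes. With an untrimmed mean, balanced up and down changes would still roughly cancel, but one-directional changes would bias the factor, which is the point about unbalanced DE above.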

When you compare two different technologies, I'd expect all the genes to be DE. So the aim should be to quantify the size of the differences rather than to test formally for DE.
