Question

Re: quantile normalization vs. data distributions

Leslie Cope ▴ 20

@leslie-cope-683

Last seen 9.6 years ago

The tests already take sample size into account, which is part of the problem. If two datasets really come from the exact same distribution, then as sample size increases, their histograms, density plots, summary statistics and so on will get closer and closer to one another. The tests take this into account, so with a large sample even a very small discrepancy is reported as significant.

This becomes a problem in our case because we know that, even with the large number of genes on a chip, there are differences in distribution from chip to chip. Some of these differences don't matter for quantile normalization. For example, a simple difference in means would obviously not be a problem for quantile normalization. Nor would a simple difference in variance. These and more complicated differences between distributions can be accounted for when building tests, but the standard tests themselves are blind and can't tell the distributional differences we care about from those we don't.

And for that matter, it is evident from recent discussion in this forum that no one is sure which differences we should care about and which don't matter. Trying to figure that out is the whole point of this thread. Because of that, I suspect that you will not get a nice clean answer to your first question at this time.

Leslie Cope, Ph.D.
Oncology Biostatistics, JHU

> 2. As a non-statistician I'm a bit confused that a statistical test will
> nearly always find a significant difference between distributions when the
> samples are large (I remember someone mentioned this to me - without
> explanations - about 2 years ago in a posting to the R-list). Is there a
> way to "normalize" the test results (e.g. the p-values) by the size of the
> sample?
>
> I guess such a significant difference as reported by a test is a *real*
> difference (otherwise all statistical tests would be worthless ...). Can
> one assume that even if the two distributions are statistically different,
> one can treat them as equal judged by visual investigation of a density
> plot or histogram?
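The two points in the reply can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not anything taken from the thread: the "chips" are simulated normal samples, the test is a hand-written two-sample Kolmogorov-Smirnov test with the usual asymptotic p-value, and the helper names (`ks_statistic`, `ks_pvalue`, `quantile_normalize`) are hypothetical. It compares the same tiny shift in means at a small and a large sample size, and then checks that two samples differing in both mean and variance carry the same set of values after quantile normalization.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the ECDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both ECDFs past x (this also handles tied values).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (an approximation)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 1e-9:
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def quantile_normalize(columns):
    """Give every equal-length column the same values: the rank-wise mean."""
    n = len(columns[0])
    sorted_cols = [sorted(c) for c in columns]
    reference = [sum(sc[r] for sc in sorted_cols) / len(columns) for r in range(n)]
    result = []
    for c in columns:
        order = sorted(range(n), key=c.__getitem__)  # indices from smallest to largest
        normed = [0.0] * n
        for rank, idx in enumerate(order):
            normed[idx] = reference[rank]
        result.append(normed)
    return result


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for repeatability only

    # Same small shift in means (0.05 SD, an assumed value) at two sample sizes.
    for n in (200, 100_000):
        chip_a = [rng.gauss(8.0, 1.0) for _ in range(n)]
        chip_b = [rng.gauss(8.05, 1.0) for _ in range(n)]
        d = ks_statistic(chip_a, chip_b)
        print(f"n={n}: D={d:.4f}, p={ks_pvalue(d, n, n):.3g}")

    # Different mean and variance, then quantile normalization.
    chip_c = [rng.gauss(8.0, 1.0) for _ in range(1000)]
    chip_d = [rng.gauss(10.0, 2.0) for _ in range(1000)]
    qc, qd = quantile_normalize([chip_c, chip_d])
    print("same values after normalization:", sorted(qc) == sorted(qd))
```

The underlying shift is the same size in both runs; what changes with the sample size is the p-value, which is why a significant result on its own does not say whether the difference matters for normalization.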

Normalization • 613 views

20.1 years ago Leslie Cope ▴ 20