Dear Marcelo,
There are at least three different issues merged together in your
question, which is one reason why your post has prompted so many
replies. The issues are:
1. Producing NAs during background correction
2. Quantile normalization on log or raw scale
3. Differential expression analysis (linear modelling) on log or raw scale
Let's consider these in turn:
1. You haven't said anything about background correction. If you are
planning to use quantile normalization, it is absolutely essential that
you avoid creating negative or zero intensities during the background
correction process. (Unfortunately I don't think that this point is made
explicitly anywhere in the limma documentation, although it has been
said several times on the Bioconductor mailing list.) See the function
backgroundCorrect() for some options.
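
To make point 1 concrete, here is a minimal sketch in plain Python (it is
not limma code). It assumes a simple subtract-then-floor rule; the
function name subtract_background and the floor value of 0.5 are made up
for illustration and are not taken from backgroundCorrect().

```python
def subtract_background(foreground, background, floor=0.5):
    """Subtract background, then reset anything below `floor` to `floor`.

    `floor` is an illustrative choice, not a recommendation.
    """
    corrected = []
    for fg, bg in zip(foreground, background):
        corrected.append(max(fg - bg, floor))
    return corrected


# Made-up single-channel intensities for four spots.
foreground = [120.0, 45.0, 300.0, 52.0]
background = [50.0, 60.0, 40.0, 52.0]

# Plain subtraction: the second spot goes negative, the fourth goes to zero.
naive = [fg - bg for fg, bg in zip(foreground, background)]

# Floored subtraction: every value stays strictly positive, so log2 is defined.
safe = subtract_background(foreground, background)
```

With plain subtraction two of the four spots would have no logarithm;
with the floor, all four do.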
2. There are no clear results on whether it is best to carry out
quantile normalization on the raw or log scale. The function
normalizeBetweenArrays() in the limma package is set up to quantile
normalize on the log scale. However, the very successful RMA algorithm
for Affymetrix data normalizes quantiles on the raw scale. I am slowly
coming around to the idea that quantile normalization might be slightly
better on the raw scale. So the choice between raw and log scale is
open. Note, however, that if you normalize on the log scale, you
absolutely must avoid NAs corresponding to negative intensities (see
point 1). Using quantile normalization on data that contain NAs arising
from negative intensities is wrong, because those NAs are not missing at
random: they stand in for the lowest intensities on the array, so the
quantiles of the remaining values are no longer comparable between
arrays.
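
As a rough illustration of point 2, the following plain-Python sketch
quantile normalizes two made-up arrays on the raw scale and on the log
scale. It is a simplified version of the general technique, not the code
used by normalizeBetweenArrays() or RMA; all names and numbers here are
assumptions for the example.

```python
import math


def quantile_normalize(columns):
    """Give every array (column) the same empirical distribution.

    Assumes equal-length columns with no missing values; ties are
    broken by position, which is a simplification.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def log2_all(columns):
    return [[math.log2(v) for v in col] for col in columns]


# Two made-up arrays of strictly positive intensities.
arrays = [[70.0, 0.5, 260.0, 8.0], [16.0, 128.0, 2.0, 32.0]]

raw_scale = log2_all(quantile_normalize(arrays))  # normalize, then log
log_scale = quantile_normalize(log2_all(arrays))  # log, then normalize


# The problem case: a negative or zero intensity has no logarithm.
def log2_or_missing(value):
    return math.log2(value) if value > 0 else None  # None plays the role of NA


with_missing = [log2_or_missing(v) for v in [70.0, -15.0, 260.0, 0.0]]
```

Both routes leave the two arrays with identical distributions, but the
normalized values differ (the smallest is about 0.32 on the raw-scale
route and 0 on the log-scale route). The last line shows where the NAs
come from: two of the four values have no log2.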
3. However you background correct, and however you normalize, there is
overwhelming evidence that linear modelling analysis, such as that done
by the limma package, is better done on the log scale. This is because
the variances are more nearly stabilized on the log scale than on the
raw scale. This is separate from point 2.
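
The variance argument in point 3 can be seen in a small simulation. This
is only a sketch under an assumed multiplicative (log-normal) error
model; the intensities, the error size sigma and the seed are arbitrary
choices for illustration, not estimates from any real data.

```python
import math
import random
import statistics

rng = random.Random(2005)  # fixed seed so the sketch is repeatable


def simulate(true_intensity, n=2000, sigma=0.3):
    """Hypothetical replicate intensities with multiplicative error."""
    return [true_intensity * rng.lognormvariate(0.0, sigma) for _ in range(n)]


low = simulate(100.0)      # a weakly expressed gene
high = simulate(10000.0)   # a strongly expressed gene

# Raw scale: the standard deviation grows with the intensity.
raw_sd_ratio = statistics.stdev(high) / statistics.stdev(low)

# Log scale: the standard deviation is roughly the same for both genes.
log_sd_ratio = (statistics.stdev([math.log2(v) for v in high])
                / statistics.stdev([math.log2(v) for v in low]))
```

Under this assumption raw_sd_ratio comes out large (the bright gene is
far more variable) while log_sd_ratio stays close to 1, which is what a
linear model with a common error variance needs.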
Gordon
----------- original message ------------
Marcelo Luiz de Laia mlaia at fcav.unesp.br
Fri Apr 1 20:20:17 CEST 2005
>Dear Bioconductor friends,
>
>I have a question that I have not found an answer to. If you have a
>paper/article that explains it, please tell me.
>
>I normalize our data using the normalize.quantile function.
>
>If I first transform our intensities (single channel) to log2, I don't
>get differentially expressed genes in limma.
>
>But if I don't transform our data, I get some genes with p.value around
>0.0001, which is great!
>
>Of course, when I transform the intensity data to log2, I get some NA.
>
>Why is there this difference? Am I wrong to do an analysis with
>non-logged data?
>
>Thanks a lot
>
>Marcelo