Sean Davis <sdavis2 at ...> writes:
>
> On Wed, Nov 19, 2008 at 10:18 AM, Michael Walter <
> michael.walter <at> med.uni-tuebingen.de> wrote:
>
> > Dear List,
> >
> > We ran our first slide of Illumina's Infinium methylation arrays. After
> > searching the archive, I still have some general questions about how
> > best to analyze the data.
> >
> > First of all, I would like to hear some opinions on normalization. In my
> > personal and probably simplistic view, I'd think that normalization is
> > not necessary, since the value you get from the array is a ratio which is
> > sample inherent (unlike a classical two-color expression array, where you
> > mix two samples to generate the expression ratio). Is this assumption
> > correct or am I missing some important aspect?
> >
>
> Unfortunately, there is a significant dye-bias issue. That is, there is a
> propensity for one dye to be brighter than the other, and it appears that
> Illumina does not adequately correct for this bias.
>
> >
I agree, the dye bias should be a problem, but in my case, confirmatory
bisulfite sequencing of interesting probes reported methylation values very
close to those from the methylation array, so I stopped worrying about this.
> > Anyway, I'd like to perform background normalization, which, as usual
> > with Illumina arrays, results in some negative values. Does anyone have a
> > neat solution for this problem, or shall I just skip the probes?
> >
>
> I have been just ignoring those probes.
>
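A minimal sketch in Python of what ignoring those probes could look like. The function name `beta_values` is hypothetical, and both the offset of 100 in the denominator and the formula beta = M / (M + U + offset) are assumptions here; neither is stated in this thread. Probes where either background-corrected intensity is negative are set to NaN rather than dropped, so the probe order is preserved.

```python
import numpy as np

def beta_values(meth, unmeth, offset=100.0):
    """Compute beta = M / (M + U + offset) per probe.

    meth, unmeth: background-corrected intensities (same shape).
    Probes with a negative intensity in either channel become NaN.
    The offset value is an assumption, not taken from the thread.
    """
    meth = np.asarray(meth, dtype=float)
    unmeth = np.asarray(unmeth, dtype=float)
    valid = (meth >= 0) & (unmeth >= 0)      # mask out negative intensities
    beta = np.full(meth.shape, np.nan)
    beta[valid] = meth[valid] / (meth[valid] + unmeth[valid] + offset)
    return beta

# Made-up intensities for three probes; the second has a negative value.
print(beta_values([900.0, -5.0, 0.0], [0.0, 50.0, 100.0]))
```

Downstream steps would then need to skip NaN entries, for example with `np.isnan`.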
> >
> > Do I have to correct for some dye effect, as for the GoldenGate
> > methylation assay? Since the probes for methylated and unmethylated DNA
> > incorporate the same dye, this shouldn't be an issue?
> >
>
> See above.
>
> >
> > My final question is basically the most pressing: What kind of statistical
> > test should I use? Since all the values are ratios between 0 and 1, I have
> > a really bad feeling about simply running some t-tests. And if a t-test is
> > the proper choice, shall I log-transform the data?
> >
>
> The t-statistic should still be valid, I think. The assumptions that go
> into statistics like the t-stat are not based on the distribution of the
> data, but on differences between values. I think these assumptions probably
> still hold in practice for these data. However, I have not tried to prove
> things one way or the other. Of course, if you are concerned about this,
> non-parametric testing will alleviate these concerns.
>
> Sean
>
I too had the same dilemma over which statistic to use with the GoldenGate
Methylation array. As I understand it, for a t test to be valid, the
underlying population has to be normally distributed, and this is manifestly
not the case for the majority of probes, at least on the GoldenGate
methylation array, not least because, as you say, the distribution is
constrained between values of 0 and 1, with most probes being unmethylated
(close to 0) or methylated (close to 1). The distribution is best described
by a beta distribution or a mix of beta distributions, depending on whether
the probe distribution is uni- or bimodal.
Therefore, I used a Mann-Whitney test followed by filtering on the basis of
the magnitude of difference in methylation value between clusters to identify
interesting probes.
Having said that, carrying out t tests did identify essentially the same set
of interesting probes...
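
The Mann-Whitney-plus-filter approach described above can be sketched roughly as follows in Python. This is only an illustration under assumptions: the function name `select_probes`, the thresholds `alpha=0.05` and `min_delta=0.2`, and the use of the difference in group medians as the "magnitude of difference" are all hypothetical choices, not the settings actually used. No multiple-testing correction is applied in this sketch.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def select_probes(beta_a, beta_b, alpha=0.05, min_delta=0.2):
    """Flag probes that differ between two sample groups.

    beta_a, beta_b: arrays of shape (n_probes, n_samples_in_group)
    holding methylation values in [0, 1].
    Returns (keep, pvals, delta), one entry per probe.
    """
    beta_a = np.asarray(beta_a, dtype=float)
    beta_b = np.asarray(beta_b, dtype=float)
    # Two-sided Mann-Whitney U test, one probe (row) at a time.
    pvals = np.array([
        mannwhitneyu(a, b, alternative="two-sided").pvalue
        for a, b in zip(beta_a, beta_b)
    ])
    # Effect size: difference in group medians (an assumed choice).
    delta = np.median(beta_a, axis=1) - np.median(beta_b, axis=1)
    keep = (pvals < alpha) & (np.abs(delta) >= min_delta)
    return keep, pvals, delta

# Made-up example: probe 0 differs clearly, probe 1 does not.
group_a = [[0.05, 0.08, 0.10, 0.07, 0.06],
           [0.50, 0.52, 0.48, 0.51, 0.49]]
group_b = [[0.80, 0.90, 0.85, 0.95, 0.88],
           [0.50, 0.49, 0.51, 0.52, 0.48]]
keep, pvals, delta = select_probes(group_a, group_b)
print(keep)
```

Swapping `mannwhitneyu` for `scipy.stats.ttest_ind` in the same loop would give the t test comparison on the same probes.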
Good luck with your analysis, and I'd be interested in hearing whether people
agree/disagree on the t test question.
Best wishes,
Ed Schwalbe
> >
> > Any input and shared experience with this type of array is highly
> > appreciated.
> >
> >
> > Best Regards,
> >
> >
> > Mike
> > --
> > Dr. Michael Walter
> >
> > The Microarray Facility
> > University of Tuebingen
> > Calwerstr. 7
> > 72076 Tübingen/GERMANY
> >
> > Tel.: +49 (0) 7071 29 83210
> > Fax.: +49 (0) 7071 29 5228
> >
> > Confidentiality Note: This message is intended only for...{{dropped:9}}
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor