Dear all,

unfortunately I did not get any reply to my post, so I am asking again,
assuming that lots of people have already come across this problem.

Working with an array set (cDNA or any single-color platform) just means
that the probes you are interested in are spread out over more than one
array (usually due to space limitations). So: same samples, but different
features.

But actually that kind of separation of the probes is rather random.
The question arises at which level of the analysis the arrays should be
aggregated.

I think the normalization and also the model fitting should be done
separately.

But since we do not only consider contrasts within each array of the
array set, but want to look, for a given contrast, at the results of all
arrays at the same time, the p-values must be adjusted somehow for this
array effect.

Doing this in a "global" manner, similar to the "global method" of
decide.tests, will probably be overly conservative.

Any suggestions?

Best,
Tefina
On Fri, Sep 11, 2009 at 8:58 AM, Tefina Paloma <tefina.paloma@gmail.com> wrote:
> The question arises at which level of the analysis the arrays should be
> aggregated.
>
> I think the normalization and also the model fitting should be done
> separately.
Why not just normalize each array in the set separately and then combine
the normalized data for analysis? I'm not sure I see why the arrays would
need to be treated independently for analysis, assuming the technology was
the same for each array in the set.

Sean
To be able to fit the same model to all arrays, an additional
between-array normalization would be necessary to make all the arrays
really comparable, and I don't want to over-normalize the data either.

Therefore I just thought of a sensible p-value adjustment.
On Fri, Sep 11, 2009 at 9:47 AM, Tefina Paloma <tefina.paloma@gmail.com> wrote:
> To be able to fit the same model to all arrays, an additional
> between-array normalization would be necessary, so to make all the
> arrays really comparable
> and I don't want to over-normalize the data either.....
>
> therefore I just thought of a sensible p value adjustment
You can adjust the entire list of p-values from all lists, if you like, as
an alternative. However, assuming that the arrays are of the same
technology, the probe-level variances should be similar, so you could also
combine the normalized data. I'm not sure what "model" you mean, as each
test is done within a probe and, therefore, would not cross arrays. But I
may have misunderstood what you are trying to do.

Sean
_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
On Fri, Sep 11, 2009 at 11:20 AM, Sean Davis <seandavi@gmail.com>
wrote:
> You can adjust the entire list of p-values from all lists, if you like, as
> an alternative. However, assuming that the arrays are of the same
> technology, the probe-level variances should be similar, so you could also
> combine the normalized data. I'm not sure what "model" you mean, as each
> test is done within a probe and, therefore, would not cross arrays. But I
> may have misunderstood what you are trying to do.
I made a further assumption above, which I should probably make explicit.
While the array technology is important in determining the variance, the
biologic behavior of the probes on the array contributes, also. If the
biologic behavior of probes on one array is expected to be "different" in
some way, then the assumption of approximately equal variance will be
violated. Then I agree that doing an analysis "within array" is the best
way to go.

Sean
Hi Sean,
On Sep 11, 2009, at 11:44 AM, Sean Davis wrote:
> I made a further assumption above, which I should probably make explicit.
> While the array technology is important in determining the variance, the
> biologic behavior of the probes on the array contributes, also.
Sorry if this is too noob-ish of a question, but I'm curious about
your choice of words. Could you explain this point a bit further? It
sounds like you are referring to the actual probes that are
synthesized onto the array, no?
What biologic behavior do you expect these probes to have? Are you
referring to them forming some secondary structure or something? If
so, why would one expect some explicitly differing behavior between
the same probes on different arrays (assuming no array impurities and
the arrays were run using the same protocol, or whatever)?
Just curious, thanks ...
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
On Fri, Sep 11, 2009 at 11:56 AM, Steve Lianoglou
<mailinglist.honeypot@gmail.com> wrote:
> Sorry if this is too noob-ish of a question, but I'm curious about your
> choice of words. Could you explain this point a bit further? It sounds like
> you are referring to the actual probes that are synthesized onto the array,
> no?
>
> What biologic behavior do you expect these probes to have? Are you
> referring to them forming some secondary structure or something? If so, why
> would one expect some explicitly differing behavior between the same probes
> on different arrays (assuming no array impurities and the arrays were
> performed using the same protocol, or whatever).
The classic example that I can think of is the hgu133a and b where the
probes on the a array were "refseq-based" and so represented well-validated
genes while the probes on the b array were generally ESTs and, being less
"qualified" as probesets, had much different error qualities than those on
the a array. If using something like limma or SAM that has some sort of
"variance pooling", the variances will be inflated in one array of the set
and decreased in the other array of the set.

I hope that helps. I have done a particularly bad job of explaining myself
above--sorry about confusion.

Sean
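
As a rough illustration of the variance-pooling point, here is a small
Python sketch (Python only to keep it self-contained; it is not limma or
SAM code). It shrinks each probe-wise variance toward a prior value with
the usual empirical-Bayes weighting
(prior_df * prior_var + resid_df * s2) / (prior_df + resid_df).
The variances, the fixed degrees of freedom and the use of a plain mean
as the prior variance are all assumptions made for illustration; limma
estimates the prior from the data in a more careful way.

```python
from statistics import mean


def moderate(variances, prior_var, prior_df, resid_df):
    """Shrink each probe-wise variance toward prior_var."""
    return [
        (prior_df * prior_var + resid_df * s2) / (prior_df + resid_df)
        for s2 in variances
    ]


# Made-up probe-wise residual variances for two arrays of one set.
array_a = [0.02, 0.03, 0.04, 0.03]  # well-behaved probes
array_b = [0.20, 0.30, 0.25, 0.25]  # noisier probes

PRIOR_DF = 4  # assumed prior degrees of freedom (normally estimated)
RESID_DF = 4  # assumed residual degrees of freedom per probe

# Combined analysis: one prior variance shared by all probes of the set.
pooled_prior = mean(array_a + array_b)
a_pooled = moderate(array_a, pooled_prior, PRIOR_DF, RESID_DF)
b_pooled = moderate(array_b, pooled_prior, PRIOR_DF, RESID_DF)

# Within-array analysis: each array gets its own prior variance.
a_within = moderate(array_a, mean(array_a), PRIOR_DF, RESID_DF)
b_within = moderate(array_b, mean(array_b), PRIOR_DF, RESID_DF)

print("array a  pooled:", [round(v, 3) for v in a_pooled])
print("array a  within:", [round(v, 3) for v in a_within])
print("array b  pooled:", [round(v, 3) for v in b_pooled])
print("array b  within:", [round(v, 3) for v in b_within])
```

With these numbers the shared prior sits between the two arrays, so every
variance on the quiet array is pulled up and every variance on the noisy
array is pulled down relative to the within-array analysis.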
Howdy,
On Sep 11, 2009, at 12:01 PM, Sean Davis wrote:
> The classic example that I can think of is the hgu133a and b where
> the probes on the a array were "refseq-based" and so represented
> well-validated genes while the probes on the b array were generally
> ESTs and, being less "qualified" as probesets, had much different
> error qualities than those on the a array. If using something like
> limma or SAM that has some sort of "variance pooling", the variances
> will be inflated in one array of the set and decreased in the other
> array of the set.
Wow ... I never used them, but I didn't know that part of hgu133*'s
history ... thanks for the lesson!
> I hope that helps. I have done a particularly bad job of explaining
> myself above--sorry about confusion.
Sure it helped, thanks.
-steve
Hello,

I might have misunderstood something, but assuming such an array set
consists of k arrays, wouldn't it be easiest to perform k normalizations
and analyses, which give you k lists of p-values for k (non-overlapping,
I assume!?) sets of genes?

To adjust for multiple testing, you then bind those k lists together into
one long vector of p-values and apply p.adjust or whatever function is
your favourite.

This saves you from normalizing between arrays that have different genes
on them etc. and seems very easy to do.

Claus
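
The "bind the k lists together and adjust once" idea can be sketched as
follows. This is plain Python rather than R, and bh_adjust is a
hand-written Benjamini-Hochberg adjustment (the method that R's p.adjust
calls "BH"), not a library call. The probe names and p-values are made up,
and the probe sets are assumed not to overlap.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the result stays monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-array results: probe ID -> raw p-value, one dict per array.
per_array = [
    {"a1": 0.001, "a2": 0.20, "a3": 0.04},
    {"b1": 0.003, "b2": 0.70},
    {"c1": 0.015, "c2": 0.50, "c3": 0.90},
]

# Bind the k lists into one long vector, remembering where each value came from.
labels = [(k, probe) for k, res in enumerate(per_array) for probe in res]
pooled = [per_array[k][probe] for k, probe in labels]

# One adjustment over all probes of the whole array set.
pooled_adj = dict(zip(labels, bh_adjust(pooled)))
significant = sorted(label for label, q in pooled_adj.items() if q <= 0.05)

for label in labels:
    print(label, round(pooled_adj[label], 4))
```

Keeping the (array index, probe) label next to each p-value makes it easy
to split the adjusted values back into per-array tables afterwards.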
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Tefina Paloma
> Sent: 11 September 2009 14:48
> To: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Array Set - Multiple Testing Problem
>
> To be able to fit the same model to all arrays, an additional
> between-array normalization would be necessary, so to make all the
> arrays really comparable
> and I don't want to over-normalize the data either.....
>
> therefore I just thought of a sensible p value adjustment
The University of Aberdeen is a charity registered in Scotland, No
SC013683.
Hello,

first of all, thanks for all the answers.

Unfortunately I do not have information about the exact behaviour of the
probes. The library was selected in-house and surely does not represent a
uniformly behaving set of genes (like in the example of the hgu133a and b
chips).

In my special case we are talking about cDNA arrays; unfortunately the
quality is not very consistent and the arrays behave very differently.
So I do have doubts about combining the data after normalization; the
quality is just not good enough.

Another issue is that in 2 arrays of the array set about a quarter of the
spots are the same, but only in these 2 arrays. I still have to sort out
how to deal with this.

I think the "safest approach" would be adjusting all p-values from all
lists together. I am curious whether this will work, i.e. whether this
approach will leave me with some significant p-values (assuming that
there is an effect), or whether the number of tests will just be too large.

Best,
Tefina
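
To get a feeling for the "too many tests" worry, one can try a toy
experiment like the Python sketch below. Everything in it is made up: five
fixed small p-values stand in for real effects, and the null probes are
represented by p-values spread evenly over (0, 1). bh_adjust is a
hand-written Benjamini-Hochberg adjustment, and Bonferroni is shown only
for comparison. It illustrates the mechanics, not any real array set.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def evenly_spaced_nulls(n):
    """Stand-in for n null probes: p-values spread evenly over (0, 1)."""
    return [(i + 0.5) / n for i in range(n)]


ALPHA = 0.05
signals = [0.00001, 0.00005, 0.0002, 0.001, 0.004]  # made-up effect p-values

counts = {}
for n_null in (100, 1000, 10000):
    pooled = signals + evenly_spaced_nulls(n_null)
    m = len(pooled)
    n_bonferroni = sum(min(1.0, p * m) <= ALPHA for p in pooled)
    n_bh = sum(q <= ALPHA for q in bh_adjust(pooled))
    counts[n_null] = (n_bonferroni, n_bh)
    print(n_null, "nulls -> Bonferroni:", n_bonferroni, "BH:", n_bh)
```

In this toy setting both procedures find fewer probes as the pooled list
grows while the number of real effects stays fixed, but the BH count never
falls below the Bonferroni count. Whether enough survives in practice
depends on how many probes actually carry an effect.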
2009/9/11 Mayer, Claus-Dieter <c.mayer@abdn.ac.uk>
> To adjust for multiple testing you need to bind those k lists together to
> one long vector of p-values and apply p.adjust or whatever function is your
> favourite one.