So my understanding is that if there are no technical replicates, one could
do a single-channel analysis in LIMMA, using the duplicateCorrelation()
command to indicate the 2 samples on the same array. This would be
equivalent to having a random effect for array, hence allowing the
simplicity of the single-channel analysis with a statistically appropriate
means of handling the within-spot correlation.
--Naomi
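
[Illustrative sketch of the equivalence described above. A random effect for
array with variance var_array, plus independent residual error with variance
var_resid, gives the two channels of one spot a correlation of
var_array / (var_array + var_resid) and leaves channels on different arrays
uncorrelated. The Python below assumes that simple equal-variance model; it
is not limma's implementation. The function names and the variance
components (0.8 and 0.2, picked so that the correlation matches the 0.8
reported further down the thread) are assumptions made for illustration.]

```python
import numpy as np


def channel_covariance(n_arrays, var_array, var_resid):
    """Covariance of the 2 * n_arrays channel log-intensities of one gene.

    Channels are ordered (array 1 red, array 1 green, array 2 red, ...).
    Assumed model: intensity = mean + random array effect + residual error.
    """
    # Two channels on the same array share the array effect.
    block = var_array * np.ones((2, 2)) + var_resid * np.eye(2)
    # Channels on different arrays share nothing, so the matrix is block diagonal.
    return np.kron(np.eye(n_arrays), block)


def contrast_variance(cov, weights):
    """Variance of a linear contrast of the channel intensities."""
    w = np.asarray(weights, dtype=float)
    return float(w @ cov @ w)


# Hypothetical variance components, chosen only for illustration.
var_array, var_resid = 0.8, 0.2
cov = channel_covariance(2, var_array, var_resid)

# Correlation between the two channels of the same spot.
rho = cov[0, 1] / cov[0, 0]

# Red minus green on the same array: the array effect cancels.
within = contrast_variance(cov, [1, -1, 0, 0])   # 2 * var_resid

# Red on array 1 minus red on array 2: nothing cancels.
across = contrast_variance(cov, [1, 0, -1, 0])   # 2 * (var_array + var_resid)

relative = within / across                       # equals 1 - rho in this model

print(f"intraspot correlation: {rho:.2f}")
print(f"within-array contrast variance: {within:.2f}")
print(f"across-array contrast variance: {across:.2f}")
print(f"variance ratio (within / across): {relative:.2f}")
```

[In this toy model the variance ratio is 1 - rho, so the higher the intraspot
correlation, the more is gained by using within-spot contrasts, and the more
it matters to estimate the correlation when within-array and across-array
comparisons are combined.]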
At 09:48 PM 6/24/2005, Gordon K Smyth wrote:
>On Sat, June 25, 2005 12:36 am, Wolfgang Huber said:
> >> Basically, you're saying that if the arrays are very high quality, you can
> >> get away with an inefficient analysis.
> >
> > Gordon, I did not say that, it sounds stupid, please do not misquote
> > people.
>
>Actually I didn't quote you at all. The word "Basically" in this context
>is a signal that I am interpreting your comments and their consequences
>rather than quoting you. You can disagree with my interpretation or can
>argue that it is mistaken, as you do below, but being misquoted is quite
>a different thing! :)
>
> >> Naomi is referring to what I call the "intraspot" correlation, see for
> >> example the intraspotCorrelation() function in the limma package, and it
> >> is critically important. The correlation isn't a bad thing, nor is it
> >> restricted to poor quality arrays. Rather it means that contrasts
> >> estimated within a spot are highly accurate.
> >
> > I agree that contrasts estimated from within one array are more
> > accurate than those from different arrays.
>
>And in order to combine these two types of contrasts efficiently in an
>analysis, one needs to quantify the difference in accuracy. Hence the need
>to estimate the intraspot correlation.
>
> > Note that when I said
> > "treat a two-color array like two single-color arrays", this was in
> > the paragraph on how to normalize, not on differential expression. But
> > apparently this still triggered off a few people ...
>
>Part of the trouble is that you continued on in the next paragraph to
>consider differential expression, and you seemed (to me at least) to be
>implying that the same conclusions continue to apply with only one caveat.
>Thanks for the clarification.
>
>As you know, I personally prefer to take advantage of the two-colour
>technology even at the normalisation stage, but that's another matter.
>
> > Two aspects were raised by Claus' question that started this thread:
> > how to normalize these data, and how to identify differentially
> > expressed genes. My experience is that multi-channel normalization
> > methods like vsn (or quantiles for that matter) work well for sets of
> > mass-produced two-color arrays. Then, it is still better to look at
> > contrasts within arrays. But it is at least possible (even if less
> > accurate / precise) to look at contrasts across arrays by directly
> > comparing the intensities, rather than always having to go through a
> > chain of log-ratios.
>
>Claus asked what is specific to Agilent. As I understand it, all your
>comments here apply to any type of two-colour array. Did you intend to say
>something specific about Agilent arrays or am I still misunderstanding
>what you mean?
>
> >> Why not do it properly and get the full benefit of the high
> >> quality arrays? My experience is that high quality
> >> Agilent arrays can beat affy for accuracy if treated properly.
> >
> > Agreed. Do you think it's because of the two colors or of the longer
> > (and hence more specific) probes?
>
>Well, Affy actually has more nucleotides per gene than Agilent when one
>takes into account the multiple probes per probe set. I don't want to
>speculate too much on the reasons, but the fact that Agilent can reliably
>lay down 80mers rather than 25mers strongly suggests that the deposition
>process is more accurate. The two colours are certainly important.
>Calculations in our lab suggest that one typically loses around 70% of
>information in a two colour experiment by going from direct to indirect
>comparisons, and 80-90% when going to single channel comparisons across
>different arrays without taking the intraspot correlations into account.
>So Agilent may be well behind Affy if not treated optimally.
>
>Gordon
>
> > Best wishes
> > Wolfgang
> >
> > <quote who="Gordon Smyth">
> >> Wolfgang,
> >>
> >> Naomi is referring to what I call the "intraspot" correlation, see for
> >> example the intraspotCorrelation() function in the limma package, and it
> >> is critically important. The correlation isn't a bad thing, nor is it
> >> restricted to poor quality arrays. Rather it means that contrasts
> >> estimated within a spot are highly accurate. It is what makes the
> >> two-colour technology intrinsically more accurate than one channel
> >> technology, other things being equal. See
> >> http://www.statsci.org/smyth/pubs/ISI2005-116.pdf
> >> for some discussion.
> >>
> >> Basically, you're saying that if the arrays are very high quality, you can
> >> get away with an inefficient analysis. Why not do it properly and get the
> >> full benefit of the high quality arrays? My experience is that high quality
> >> Agilent arrays can beat affy for accuracy if treated properly.
> >>
> >> Gordon
> >>
> >>>Date: Thu, 23 Jun 2005 15:29:38 +0100 (BST)
> >>>From: "Wolfgang Huber" <huber at ebi.ac.uk>
> >>>Subject: Re: [BioC] Agilent Arrays
> >>>To: "Naomi Altman" <naomi at stat.psu.edu>
> >>>Cc: bioconductor at stat.math.ethz.ch
> >>>
> >>>Hi Naomi,
> >>>
> >>>and why is that important? Also, what is the within gene correlation
> >>>between green foreground of array 1 and green foreground of array 2?
> >>>
> >>>Bw
> >>> Wolfgang
> >>>
> >>><quote who="Naomi Altman">
> >>> > I am working with Agilent arrays on which we have spotted many
> >>> > replicates of the control spots.
> >>> > The within gene correlation between red and green foreground is about
> >>> > 0.8 for the unnormalized data - i.e. pretty high!
> >>> >
> >>> > --Naomi
> >>> >
> >>> > At 03:23 AM 6/23/2005, Wolfgang Huber wrote:
> >>> >>Hi Claus,
> >>> >>
> >>> >>for the normalization of arrays where the spotting etc. variability
> >>> >>between chips is not strong, you can treat the data from m two-colour
> >>> >>arrays as if it were 2*m single colour ones, and use methods like
> >>> >>"quantiles" or "vsn".
> >>> >>
> >>> >>Note that for almost all genes, the hybridization is not limited by
> >>> >>the amount of probe DNA, hence the competition between red and green
> >>> >>target is negligible for almost all genes (except possibly the most
> >>> >>highly expressed ones). This justifies treating a two-color array like
> >>> >>two single-color arrays.
> >>> >>
> >>> >>Only later when you consider the contrasts of interest for finding
> >>> >>differentially expressed genes, you want to make sure that these are
> >>> >>not confounded with dye.
> >>> >>
> >>> >>PS, I think your question is very directly Bioconductor related!
> >>> >>
> >>> >>Best wishes
> >>> >> Wolfgang
> >>> >>
> >>> >>
> >>> >><quote who="Claus Mayer">
> >>> >> > Dear all!
> >>> >> >
> >>> >> > Apologies for asking a question which is not directly Bioconductor
> >>> >> > related: After some experience with spotted 2-channel arrays and
> >>> >> > Affy data, I am currently analysing my first data set based on Agilent
> >>> >> > arrays. I know that packages like marray or limma have facilities to
> >>> >> > read these data and that they can be normalised and analysed like any
> >>> >> > other 2-colour arrays. On the other hand the printing technology of
> >>> >> > these arrays (using inkjet-printing of 60mer oligos) is closer in spirit
> >>> >> > to Affy, if I understand this correctly. This seems to show in the data
> >>> >> > as well. For example the strongest correlations I found in the single
> >>> >> > channel (log-)intensities were not between the two channels observed on
> >>> >> > the same slide (like with spotted arrays), but between the two channels
> >>> >> > (differently dyed on different arrays in a loop design) that contained
> >>> >> > the same sample (which is quite reassuring). This made me wonder whether
> >>> >> > (once dye and array effects have been removed by some normalisation
> >>> >> > method) with Agilent arrays one might really use single channel
> >>> >> > intensities as measures of gene expression instead of reducing them to
> >>> >> > the log-ratio only as is usually done for two-channel data.
> >>> >> >
> >>> >> > This would have consequences on the way these arrays should be
> >>> >> > normalised (rather by a multichip method than individually) and also
> >>> >> > allow more flexibility in the design of experiments.
> >>> >> >
> >>> >> > As I said before this is my first Agilent data set, so I would be
> >>> >> > interested to hear opinions of others with more experience. Before I
> >>> >> > start to re-invent the wheel here, I'd also be interested to know
> >>> >> > whether any of you is aware of tools, software, papers, etc. dealing
> >>> >> > with the analysis of Agilent array data specifically (rather than just
> >>> >> > applying standard methods for 2-coloured cDNA arrays).
> >>> >> >
> >>> >> > Any help/comments appreciated
> >>> >> >
> >>> >> > Claus
> >>> >> >
> >>> >> > --
> >>> >> >
> >>> >> > ***********************************************************************************
> >>> >> > Claus-D. Mayer | http://www.bioss.ac.uk
> >>> >> > Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
> >>> >> > Rowett Research Institute | Telephone: +44 (0) 1224 716652
> >>> >> > Aberdeen AB21 9SB, Scotland, UK. | Fax: +44 (0) 1224 715349
> >>> >> >
> >>> >> > _______________________________________________
> >>> >> > Bioconductor mailing list
> >>> >> > Bioconductor at stat.math.ethz.ch
> >>> >> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> >> >
> >>> >> >
> >>> >>
> >>> >>
> >>> >>-------------------------------------
> >>> >>Wolfgang Huber
> >>> >>European Bioinformatics Institute
> >>> >>European Molecular Biology Laboratory
> >>> >>Cambridge CB10 1SD
> >>> >>England
> >>> >>Phone: +44 1223 494642
> >>> >>Http: www.ebi.ac.uk/huber
> >>> >>
> >>> >
> >>> > Naomi S. Altman 814-865-3791 (voice)
> >>> > Associate Professor
> >>> > Bioinformatics Consulting Center
> >>> > Dept. of Statistics 814-863-7114 (fax)
> >>> > Penn State University 814-865-1348 (Statistics)
> >>> > University Park, PA 16802-2111
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111