Agilent Arrays
Claus Mayer ▴ 340
@claus-mayer-1179
Last seen 10.2 years ago
European Union
Dear all!

Apologies for asking a question which is not directly Bioconductor related. After some experience with spotted 2-channel arrays and Affy data, I am currently analysing my first data set based on Agilent arrays. I know that packages like marray or limma have facilities to read these data and that they can be normalised and analysed like any other 2-colour arrays. On the other hand, the printing technology of these arrays (inkjet printing of 60mer oligos) is closer in spirit to Affy, if I understand this correctly.

This seems to show in the data as well. For example, the strongest correlations I found in the single-channel (log-)intensities were not between the two channels observed on the same slide (as with spotted arrays), but between the two channels (differently dyed, on different arrays in a loop design) that contained the same sample (which is quite reassuring). This made me wonder whether (once dye and array effects have been removed by some normalisation method) with Agilent arrays one might really use single-channel intensities as measures of gene expression, instead of reducing them to the log-ratio only, as is usually done for two-channel data.

This would have consequences for the way these arrays should be normalised (by a multichip method rather than individually) and would also allow more flexibility in the design of experiments.

As I said before, this is my first Agilent data set, so I would be interested to hear opinions of others with more experience. Before I start to re-invent the wheel here, I'd also be interested to know whether any of you is aware of tools, software, papers, etc. dealing with the analysis of Agilent array data specifically (rather than just applying standard methods for 2-coloured cDNA arrays).

Any help/comments appreciated

Claus

--
Claus-D. Mayer | http://www.bioss.ac.uk
Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
Rowett Research Institute | Telephone: +44 (0) 1224 716652
Aberdeen AB21 9SB, Scotland, UK. | Fax: +44 (0) 1224 715349
affy limma marray • 1.8k views
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.6 years ago
United States
I am working with Agilent arrays on which we have spotted many replicates of the control spots. The within-gene correlation between red and green foreground is about 0.8 for the unnormalized data, i.e. pretty high!

--Naomi

At 03:23 AM 6/23/2005, Wolfgang Huber wrote:
> [snip]

Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111
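The correlation Naomi describes can be computed directly from the foreground values of the replicate control spots. A minimal sketch in plain Python follows, assuming log2 foreground intensities; the numbers are hypothetical and are not her data, and `pearson` is an illustrative helper rather than a function from any Bioconductor package.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 foreground intensities for replicate spots of ONE control
# oligo on ONE array. Every spot carries the same probe, so any correlation
# between the channels here reflects spot-level effects, not gene-to-gene
# differences in expression.
red = [10.1, 10.6, 9.8, 10.9, 10.3, 9.6]
green = [9.9, 10.5, 9.9, 10.6, 10.4, 9.5]

r = pearson(red, green)
print(round(r, 2))
```

Restricting the calculation to replicates of a single oligo is what separates a spot effect from the trivial correlation that arises because different genes sit at different intensity levels.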
Hi Naomi,

and why is that important? Also, what is the within-gene correlation between the green foreground of array 1 and the green foreground of array 2?

Bw
Wolfgang

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Http: www.ebi.ac.uk/huber
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…
Hi Claus,

for the normalization of arrays where the spotting etc. variability between chips is not strong, you can treat the data from m two-colour arrays as if it were 2*m single-colour ones, and use methods like "quantiles" or "vsn".

Note that for almost all genes, the hybridization is not limited by the amount of probe DNA, hence the competition between red and green target is negligible for almost all genes (except possibly the most highly expressed ones). This justifies treating a two-color array like two single-color arrays.

Only later, when you consider the contrasts of interest for finding differentially expressed genes, do you want to make sure that these are not confounded with dye.

PS, I think your question is very directly Bioconductor related!

Best wishes
Wolfgang
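To make the "2*m single-colour arrays" idea concrete, here is a minimal sketch of quantile normalisation applied to every channel as its own column. It is written in plain Python for illustration only; it is not the code of limma or vsn, the intensities are hypothetical, and ties are broken by position, which is a simplification.

```python
def quantile_normalize(columns):
    """Give every column the same empirical distribution.

    columns: list of equal-length lists, one per channel.
    Returns a new list of columns; the ranks within each column are kept.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across columns
    reference = [sum(col[k] for col in sorted_cols) / len(columns)
                 for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by value
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical log2 intensities for 5 probes on m = 2 two-colour arrays,
# treated as 2*m = 4 single-channel columns.
channels = {
    "array1_R": [8.0, 10.0, 12.0, 9.0, 11.0],
    "array1_G": [7.5, 9.0, 11.5, 8.5, 10.0],
    "array2_R": [8.5, 10.5, 12.5, 9.5, 11.5],
    "array2_G": [7.0, 9.5, 11.0, 8.0, 10.5],
}
norm = quantile_normalize(list(channels.values()))
for name, col in zip(channels, norm):
    print(name, col)
```

After this step all four columns share one distribution, so dye and array differences in overall intensity are removed jointly; whether that is appropriate depends on the assumption stated above, that between-chip spotting variability is small.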
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia
Sorry, my last email should have had the above subject heading. Gordon
@michael_kirkwmiusydeduau-1045
Last seen 10.2 years ago
While I agree that it is probably a bad idea to use single-channel analysis on two-colour arrays, some of the arguments presented here are a little troubling.

The observation that the intra-slide correlation is 0.8 doesn't, to my mind, show anything unless it is high relative to the inter-slide correlation. Regardless of what treatments are applied to the samples, all mouse (say) samples would be expected to have roughly similar (array-wise) expression profiles. This is partly a reflection of the fact that many genes may not vary between treatments, and that different probes will have different hybridization efficiencies (i.e. some spots will always have low intensities and some high).

Secondly, IF the single-channel intensities were in fact highly accurate, then it is the two-colour analysis that would be inefficient (in terms of the number of arrays required). The two-colour idea is essentially to overcome noise, particularly noise due to variation in the printed spots between slides (i.e. the chemical/physical properties of a spot for a given gene may vary between slides). In this case the variation is assumed to affect each hybridized sample similarly (multiplicatively), and by taking the ratio this variation is removed. A fine idea, but it does leave us with less information than if the slide quality were sufficient for this to be unnecessary.

From the two-colour analysis of a single slide we have a set of ratios, which may then be compared between slides. From the single-channel analysis of a two-colour hybridization we have two sets of measurements, which also may be compared between slides.

With two-colour analysis, only three samples can be compared using two slides, whereas if the single-channel analysis was justified (and note I am not saying it is, only discussing the arguments given against it), then four samples can be compared.

Michael

> Wolfgang,
>
> Naomi is referring to what I call the "intraspot" correlation (see, for example, the intraspotCorrelation() function in the limma package), and it is critically important. The correlation isn't a bad thing, nor is it restricted to poor quality arrays. Rather, it means that contrasts estimated within a spot are highly accurate. It is what makes the two-colour technology intrinsically more accurate than one-channel technology, other things being equal. See http://www.statsci.org/smyth/pubs/ISI2005-116.pdf for some discussion.
>
> Basically, you're saying that if the arrays are very high quality, you can get away with an inefficient analysis. Why not do it properly and get the full benefit of the high quality arrays? My experience is that high quality Agilent arrays can beat affy for accuracy if treated properly.
>
> Gordon
>
> > Date: Thu, 23 Jun 2005 15:29:38 +0100 (BST)
> > From: "Wolfgang Huber" <huber at ebi.ac.uk>
> > Subject: Re: [BioC] Agilent Arrays
> > To: "Naomi Altman" <naomi at stat.psu.edu>
> > Cc: bioconductor at stat.math.ethz.ch
> >
> > [snip]
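The cancellation Michael describes, where a multiplicative spot effect hits both channels equally and drops out of the ratio, can be checked with a few lines of arithmetic. This is a toy sketch in plain Python with hypothetical abundances and gain factors; it ignores background, dye bias and additive noise.

```python
import math

# Hypothetical true abundances of one gene in two samples (arbitrary units)
true_a, true_b = 400.0, 100.0
# Hypothetical multiplicative spot effects for the same probe on three slides
spot_gains = [0.5, 1.0, 2.0]

log_red, log_green, log_ratio = [], [], []
for gain in spot_gains:
    red = gain * true_a      # sample A labelled red
    green = gain * true_b    # sample B labelled green
    log_red.append(math.log2(red))
    log_green.append(math.log2(green))
    log_ratio.append(math.log2(red / green))

# Single-channel values shift with the spot gain; the log-ratio does not.
print("log2 red:  ", [round(v, 2) for v in log_red])
print("log2 green:", [round(v, 2) for v in log_green])
print("log2 ratio:", [round(v, 2) for v in log_ratio])
```

The single-channel log-intensities differ by the log of the gain from slide to slide, while the log-ratio stays at log2(400/100) on every slide. If the gains were all close to 1, the single-channel values would be nearly as stable as the ratio, which is the situation Michael's "IF" refers to.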
Kirk,

You have missed my point about the 50 control spots. These are all the SAME oligo. The correlation here is induced by the array, not by the RNA concentration, hybridization efficiency, etc.

The reason that the two-color analysis is supposed to be more efficient than 1 channel is the positive correlation between the errors for the 2 channels on the same spot. If the channels are uncorrelated, then there is no spot effect and using the differences is no better than using the 2 channels.

The single-channel analysis can be used, provided that you use a linear mixed model that includes a random effect for array and, in the case of multiple spots per gene, for spot(array).

--Naomi

At 09:20 PM 6/26/2005, Michael Kirk wrote:
> [snip]
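Naomi's efficiency argument can be illustrated with a small simulation. The sketch below assumes a simple model that is not taken from the thread: each log-intensity is a shared spot effect plus independent channel noise, with total variance 1 and an intraspot correlation of 0.8 (the value she reported). No true expression difference is simulated, so the variances shown are pure error variances.

```python
import random
import statistics

rng = random.Random(0)          # fixed seed so the run is reproducible
rho = 0.8                       # assumed intraspot correlation
tau = rho ** 0.5                # sd of the shared spot effect
sigma = (1 - rho) ** 0.5        # sd of the channel-specific noise
n = 20000                       # number of simulated probes

within, between = [], []
for _ in range(n):
    spot1, spot2 = rng.gauss(0, tau), rng.gauss(0, tau)
    red1 = spot1 + rng.gauss(0, sigma)     # red channel, array 1
    green1 = spot1 + rng.gauss(0, sigma)   # green channel, same spot
    green2 = spot2 + rng.gauss(0, sigma)   # green channel, array 2
    within.append(red1 - green1)    # same spot: the spot effect cancels
    between.append(red1 - green2)   # different arrays: it does not

var_within = statistics.variance(within)
var_between = statistics.variance(between)
print(round(var_within, 2), round(var_between, 2))
```

Under this model the within-spot difference has variance 2(1 - rho) and the between-array difference has variance 2, so the log-ratio is about five times more precise at rho = 0.8, and the advantage vanishes as rho approaches 0. A mixed model with a random effect for array (and spot within array) is one way to account for this correlation when the channels are analysed separately.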
Quoting Naomi Altman <naomi at stat.psu.edu>:

> Kirk,
> You have missed my point about the 50 control spots. These are all the SAME oligo.
> The correlation here is induced by the array, not by the RNA concentration, hybridization efficiency, etc.

OK, if it's the same oligo then that does indicate a spot, or hybridization, effect. Not surprising, I guess. Out of curiosity, did you quantify the size of the effect? I.e. was the inter-replicate within-slide variation of significant size relative to whatever variation may be expected between treatments?

Michael

> [snip]
