duplicate correlation on Agilent 4x44 arrays
@gordon-smyth
WEHI, Melbourne, Australia
Dear Mitch,

You don't say what instructions you are trying to follow here. I think you may be trying to use code which was intended for other data sets. I suspect that there may be more than one problem.

Firstly, why do you need to use readGAL()? This is only needed with SPOT data. Your RG object from read.maimages() will already contain annotation information from the Agilent output files. Look at names(RG$genes) to see what you have.

Secondly, does your GAL file match your data files? Type

   dim(RG)

and

   gal <- readGAL()
   dim(gal)

Do the row numbers agree? I am guessing they may have different numbers of rows.

BTW, do you need to use "normexp"? I've found the AgilentFE background estimator is already pretty good, and doesn't produce negative intensities anyway.

Best wishes
Gordon

>Date: Mon, 9 Apr 2007 12:21:57 +0200
>From: "Mitch Levesque" <mitch.levesque at tuebingen.mpg.de>
>Subject: [BioC] duplicate correlation on Agilent 4x44 arrays
>To: <bioconductor at stat.math.ethz.ch>
>
>Hi Bioconductors,
>
>I am using R 2.4.1 and limma to analyze the new Agilent 4x44 array design
>and am having trouble with the duplicate correlation function using the
>following script:
>
>library(limma)
>targets <- readTargets("Targets.txt")
>RG <- read.maimages(targets$FileName, source="agilent")
>RG$genes<-readGAL()
>RG$printer<-getLayout(RG$genes)
>RG <- backgroundCorrect(RG, method="normexp", offset=50)
>MA <- normalizeWithinArrays(RG, method="loess")
>MA <- MA[order(RG$genes[,"ID"]),]
>
>I get the following error:
>
>Error in `[.MAList`(MA, order(RG$genes[, "ID"]), ) :
>        subscript out of bounds
>
>I would like to treat the duplicate probes on each array as a technical
>replicate, but since the spacing is not consistent for each gene, I must
>first order the list by reference number. Are there any suggestions about
>how I may do this?
>
>Mitch
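The row-count check suggested above can be tried on small synthetic objects, without any Agilent files. This is only a sketch: the sizes (6 data rows, 24 annotation rows) and the `gal` data frame are made up for illustration, and the `MAList` is built by hand rather than read from disk. It shows why ordering by an annotation table with more rows than the data gives a "subscript out of bounds" style error.

```r
library(limma)

set.seed(1)

# Hypothetical stand-in for the normalized data: 6 probes on 2 arrays
MA <- new("MAList", list(
  M = matrix(rnorm(12), nrow = 6, ncol = 2),
  A = matrix(rnorm(12, mean = 8), nrow = 6, ncol = 2)
))

# Hypothetical stand-in for a GAL table that has more rows than the data
gal <- data.frame(ID = sprintf("P%02d", 24:1), stringsAsFactors = FALSE)

# The check: do the row counts agree?
rows_agree <- nrow(MA) == nrow(gal)

# order() on the 24-row table returns indices up to 24,
# which cannot index a 6-row object
idx <- order(gal$ID)
res <- tryCatch(MA[idx, ], error = function(e) e)

print(rows_agree)
print(class(res))
```

If `rows_agree` is FALSE, the annotation table should not be used to index or reorder the data object.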
@mitch-levesque-2102
Gordon,

Thanks for the reply. I am not using any particular instruction set, just what I have put together from the User Guide.

You were right about the file dimensions, they are different:

> dim(RG)
[1] 44407     4
> gal <- readGAL()
> dim(gal)
[1] 180880     10

Is it possible to read the duplicate positions directly off of the gal file? I tried:

layout <- getLayout(gal, guessdups=TRUE)

and I get the following:

$ngrid.r
[1] 1

$ngrid.c
[1] 4

$nspot.r
[1] 170

$nspot.c
[1] 266

$ndups
[1] 8

$spacing
[1] NA

attr(,"class")
[1] "PrintLayout"

I haven't tried without the normexp, but I will test it. Thanks again.

Mitch
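As a quick arithmetic check on the numbers reported above: the layout found by getLayout() (1 x 4 grids of 170 x 266 spots) accounts exactly for the 180880 rows of the GAL file, which suggests that the GAL file describes the whole slide rather than a single array. The sketch below only redoes that multiplication; the variable names are illustrative.

```r
# Values copied from the getLayout() and dim() output above
ngrid.r <- 1
ngrid.c <- 4
nspot.r <- 170
nspot.c <- 266

gal_rows  <- 180880   # rows in the GAL file
data_rows <- 44407    # rows in RG

# Spots in one grid block, and on the whole slide
spots_per_block <- nspot.r * nspot.c
spots_on_slide  <- ngrid.r * ngrid.c * spots_per_block

# Does the layout explain the GAL row count? Does one block match the data?
slide_matches_gal  <- spots_on_slide == gal_rows
block_matches_data <- spots_per_block == data_rows

print(c(spots_per_block = spots_per_block, spots_on_slide = spots_on_slide))
print(c(slide_matches_gal = slide_matches_gal,
        block_matches_data = block_matches_data))
```

Neither the slide total nor the per-block count equals the number of rows in RG, so the GAL rows cannot be matched to the data rows one-to-one.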
On Tuesday 10 April 2007 08:07, Mitch Levesque wrote:
> Gordon,
>
> Thanks for the reply. I am not using any particular instruction set, just
> what I have put together from the User Guide.
>
> You were right about the file dimensions, they are different:
> > dim(RG)
>
> [1] 44407 4
>
> > gal <- readGAL()
> > dim(gal)
>
> [1] 180880 10
>
> Is it possible to read the duplicate positions directly off of the gal
> file? I tried:

If you are thinking that the four different arrays represent "duplicates", then that probably isn't correct. The "duplicates" in the sense of duplicateCorrelation are duplicate spots with the same sample hybridized to them; hybridizing the same sample multiple times on the same slide is not the typical use case (but perhaps you did do this?).

There are not many duplicate spots on Agilent arrays unless you have an array design where this is the case. I don't recall what you said about your array design, but unless there are duplicates of many thousands of probes out of the total of 44k probes within one array, using duplicateCorrelation is probably not warranted.

> layout <- getLayout(gal, guessdups=TRUE)

The confusion here, I think, is in the fact that the GAL file is for the entire slide (which includes 4 arrays). You should not use the GAL file for these arrays; just get the information from the Agilent FE file, which read.maimages will load automatically with source='agilent'. If there are other columns that you need, you can specify them directly in the call to read.maimages(); see the documentation.

Also, note that Agilent uses so-called orange-packed array designs, so the old idea of row/column doesn't translate perfectly, as each row is offset from the next. Also, within a given array (and on the 4x44, there are four such arrays), there are no subarrays.

> I haven't tried without the normexp, but I will test it. Thanks again.

Agilent uses a rather sophisticated background estimation method, so I agree with Gordon that there really isn't a need to do more for these arrays. You can read the technical manual for the platform for a full description of the algorithm (which I would encourage).

Sean
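One way to judge whether duplicateCorrelation is warranted is to count how many probes are actually replicated within an array, using the annotation carried by the data object itself rather than a GAL file. The sketch below uses a small hand-built MAList; the probe names are made up, and the column name `ProbeName` is an assumption about what names(RG$genes) shows for your files. Note that duplicateCorrelation() takes `ndups` and `spacing` arguments, so it presumes the same number of replicates at a regular spacing for every probe.

```r
library(limma)

set.seed(2)

# Hypothetical probe names: P1 spotted three times, P2 twice, the rest once
probe <- c("P3", "P1", "P2", "P1", "P4", "P2", "P1", "P5")

MA <- new("MAList", list(
  M = matrix(rnorm(16), nrow = 8, ncol = 2),
  A = matrix(rnorm(16, mean = 8), nrow = 8, ncol = 2),
  genes = data.frame(ProbeName = probe, stringsAsFactors = FALSE)
))

# Number of spots per probe, and how many probes are replicated at all
reps <- table(MA$genes$ProbeName)
n_replicated    <- sum(reps > 1)
frac_replicated <- mean(reps > 1)

# Order by the object's own annotation so that replicate spots are adjacent;
# the index always has the same length as the data, so it cannot go out of bounds
MA_sorted <- MA[order(MA$genes$ProbeName), ]

print(reps)
print(MA_sorted$genes$ProbeName)
```

If only a small fraction of probes is replicated, or the replicate counts in `reps` are unequal, the regular ndups/spacing layout that duplicateCorrelation() expects is not available even after sorting.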