Anyone Know how to make a fake CEL file?


Park, Richard ▴ 220

@park-richard-227

Last seen 11.4 years ago

Hi Johannes,

I am going to use real values from experimental CEL files to make various fake CEL files with varying percentages of genes changing over time, e.g. having 20% of the genes change with a 2-5 fold change by the third time point. The point of all of this is to understand what happens during normalization. For analyses of anywhere between 2 and 30 chips, normalizing seems to be fine, but when you start incorporating very different experimental conditions in very large data groups (60+ chips), normalizing seems to minimize the differences between the conditions.

We would like to create an analysis of 100+ chips of real data to understand various cell types, but with chips that are very different, and that many of them, normalization seems to severely limit the ability to see the differences between the conditions. That is why it would be nice to see the effects of rma (background correction and normalization), and a comparison with MAS 5.0 values with spline normalization, on a large set of "fake" CEL data.

> i2xy <- function(i) cbind((i-1) %% 640, (i-1) %/% 640)
#this is for HGU-95Av2 chips

Does anyone know the corresponding i2xy function that would be needed for the mu72av2 chip?

I would appreciate any feedback from the Bioconductor community. I haven't found anything on the internet or in the literature that addresses this problem.

Thanks everyone,
Richard Park

-----Original Message-----
From: Johannes Freudenberg [mailto:mai98ftu@studserv.uni-leipzig.de]
Sent: Tuesday, October 14, 2003 16:6 PM
To: Park, Richard
Subject: Re: [BioC] Anyone Know how to make a fake CEL file?

Hi,

> how do you know the corresponding x and y locations on
> the chip that correspond with the various affy ids?

The information on the probe locations is stored in the cdf environments and can be accessed as follows:

> env <- getCdfInfo(Dilution) #get the CDF environment
>
> #get the probe locations
> loc <- apply(matrix(ls(env = env)), 1, get, env = env)
>
> #That's how it's done in S-Plus
> #loc <- getCdfInfo(Dilution)
>
> loc[[1]] # show the probe locations of the first gene
         pm     mm
[1,] 175218 175858
[2,] 356689 357329
[3,] 227696 228336
[4,] 237919 238559
...

These indices refer to the rows of the intensity matrix, which is stored in the @exprs slot of the affybatch object. In order to get the corresponding x and y coordinates you can use the i2xy() function:

> i2xy <- function(i) cbind((i-1) %% 640, (i-1) %/% 640)
#this is for HGU-95Av2 chips
#corrected version, older BioC version incorrect!
#search BioC mailing list archive for more details

Out of curiosity, may I ask how you are going to 'fake' the different treatments? Are you using real data or simulated data?

Best wishes,
Johannes

Quoting "Park, Richard" <richard.park@joslin.harvard.edu>:

> I am trying to make a couple of fake cel files to represent a time
> course treatment between three time points.
> I am trying to test the effects of normalization on various possible
> treatments.
> Is there a way to make a fake CEL file?
> and if there is, how do you know the corresponding x and y locations on
> the chip that correspond with the various affy ids? I know that this
> information is located in the various cdf files, but I am unaware of how
> to access that information.
>
> Thanks for any help,
>
> Richard Park
> Immunology - Computational Data Analyzer
> Joslin Diabetes Center
> Ph: 617-732-2482
> Richard.Park@joslin.harvard.edu
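A rough sketch of the experiment described in this post, in base R only (no CEL files or Bioconductor packages needed). It assumes simulated log-normal baseline intensities in place of real CEL values, spikes a 2-5 fold increase into 20% of the genes by the third time point, and applies a hand-written quantile normalization as a stand-in for the normalization step of rma. All object and function names (baseline, fake, quantile_normalize, ...) are made up for illustration and are not part of any package.

```r
# Sketch only: base R, hypothetical names, simulated data instead of real CEL values.
set.seed(1)

n_genes <- 1000
# Stand-in for baseline intensities that would normally come from a real chip.
baseline <- 2^rnorm(n_genes, mean = 8, sd = 1.5)

# Pick 20% of the genes and give each one a fold change between 2 and 5.
changing <- sample(n_genes, size = round(0.2 * n_genes))
fc <- runif(length(changing), min = 2, max = 5)

# Three time points: unchanged at t1, half the log fold change at t2, full at t3.
fake <- cbind(t1 = baseline, t2 = baseline, t3 = baseline)
fake[changing, "t2"] <- baseline[changing] * sqrt(fc)
fake[changing, "t3"] <- baseline[changing] * fc

# Plain quantile normalization: every column is forced onto the same
# distribution (the mean of the sorted columns).
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")
  target <- rowMeans(apply(m, 2, sort))
  out    <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(m)
  out
}

normalized <- quantile_normalize(fake)

# Median log2 fold change (t3 vs t1) of the spiked genes, before and after.
lfc_raw  <- median(log2(fake[changing, "t3"] / fake[changing, "t1"]))
lfc_norm <- median(log2(normalized[changing, "t3"] / normalized[changing, "t1"]))
```

Because every spiked gene goes up in this setup, no normalized t3/t1 ratio can exceed the raw one, so comparing lfc_raw with lfc_norm shows the kind of compression the post describes. Changing the 20% share, or mixing up- and down-regulated genes, is an easy way to see how the effect varies.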

Normalization cdf probe affy

updated 22.3 years ago by mai98ftu@studserv.uni-leipzig.de ▴ 140 • written 22.3 years ago by Park, Richard ▴ 220


mai98ftu@studserv.uni-leipzig.de ▴ 140

@mai98ftustudservuni-leipzigde-338

Last seen 11.4 years ago

Hi Richard,

> Does anyone know the corresponding i2xy function that would be needed
> for the mu72av2 chip?

As far as I know, the i2xy() function is included in the CDF packages that you can download from the Bioconductor website (click on MetaData). I didn't find the respective package for the mu72av2 chip though. I'm assuming it's something like

i2xy <- function(i) cbind((i-1) %% nc, (i-1) %/% nc)

where nc is the number of columns of the chip. The same value has to be used for the remainder and for the integer division. For a square chip it equals the number of rows nr, which is the case for hgu95av2:

> Dilution@nrow
[1] 640
> Dilution@ncol
[1] 640

I hope that helps,
Johannes
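A minimal sketch of the index conversion discussed in this thread, generalized to an arbitrary chip width. It assumes probes are numbered row by row with 0-based x and y, i.e. i = x + nc * y + 1, which matches the 640-wide formula quoted for HGU-95Av2. The nc argument and the inverse helper xy2i are additions for illustration, and the width of any other chip has to come from its own CDF information.

```r
# Assumption: i = x + nc * y + 1, with 0-based x and y and nc columns per row.
i2xy <- function(i, nc) {
  cbind(x = (i - 1) %% nc, y = (i - 1) %/% nc)
}

# Hypothetical inverse helper, useful for checking the round trip.
xy2i <- function(x, y, nc) {
  x + nc * y + 1
}

nc  <- 640                  # HGU-95Av2, as reported by Dilution@ncol in the thread
idx <- c(175218, 175858)    # first pm/mm pair from the loc[[1]] output in the thread
xy  <- i2xy(idx, nc)
xy
#        x   y
# [1,] 497 273
# [2,] 497 274

# Converting back recovers the original indices.
back <- xy2i(xy[, "x"], xy[, "y"], nc)
```

Under this numbering the mm probe sits one row below its pm partner (same x, y + 1), which is why the pm and mm indices in the thread differ by exactly 640.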

written 22.3 years ago by mai98ftu@studserv.uni-leipzig.de ▴ 140
