Affy: gene filtering before or after normalization??
Jing Shen ▴ 10
@jing-shen-471
Last seen 9.6 years ago
Hi,

I am going to analyze Affymetrix data using the Bioconductor affy package. As I understand it, the analysis procedure should be:

(1) import data (CEL files)
(2) data filtering (??): get rid of bad or false intensities (e.g., something like filtering on flags, present/absent calls, or expression values in GeneSpring)
(3) data normalization: several different methods, working at the probe-cell level or the probe-set level
(4) now you have the data for statistical analysis

Can anybody give me some suggestions on whether to filter the data before or after normalization if I use rma() or expresso()? What kind of gene filtering criteria do you use? Or do I not need to filter at all?

Thanks,
Jing
Normalization probe • 1.9k views
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…
Hi Jing,

in my experience (and it seems to be that of others), quality control, including gene filtering, should be done during and after normalization; there is not much to be done before. That is, if you use a model-based normalization, you can use the residuals for QC, and you can use the reproducibility of the per-probe intensities across chips for QC. Since Affy GeneChips have multiple probes per gene, you should make use of that, too. I think Francois Collins has a nice method for QC based on the residuals of the probe-set-summary model.

All this assumes that there are no obvious scratches, fingerprints, gradients, etc. on the chip image itself, in which case you should probably send the chips back to the lab...

Best wishes
Wolfgang

-------------------------------------
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/abt0840/whuber
-------------------------------------

On Thu, 9 Oct 2003, Jing Shen wrote:
> I am wondering if anybody can give me some suggestions on data filtering
> before (or after??) my data normalization [...]
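To make the idea of residual-based QC concrete, here is a minimal sketch in plain Python. It is not the affy package's code and not the method referred to above; it only assumes an additive model (log2 intensity = overall level + probe effect + chip effect) for a single probe set, fits it with Tukey's median polish, and looks for large residuals. The data, the function name `median_polish`, and the cutoff `THRESHOLD` are all made up for illustration.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Fit matrix[i][j] ~ overall + row[i] + col[j] by Tukey's median polish.

    Returns (overall, row_effects, col_effects, residuals).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [list(row) for row in matrix]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row medians (probe effects).
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the overall level.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out column medians (chip effects).
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the overall level.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        # Stop once a full sweep no longer changes anything noticeably.
        if max(abs(x) for x in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


# Made-up log2 intensities for ONE probe set: rows are probes, columns are chips.
probe_effects = [0.0, 1.0, -1.0, 0.5, -0.5]
chip_effects = [8.0, 8.2, 7.9, 8.1]
data = [[p + c for c in chip_effects] for p in probe_effects]
data[1][2] += 3.0  # simulate one damaged cell (probe 1 on chip 2)

overall, row_eff, col_eff, resid = median_polish(data)

THRESHOLD = 1.0  # arbitrary cutoff on the log2 scale, chosen for this toy example
flagged = [
    (i, j)
    for i, row in enumerate(resid)
    for j, r in enumerate(row)
    if abs(r) > THRESHOLD
]
print(flagged)
```

In this toy example only the damaged cell is flagged, and the fitted chip effects are barely disturbed by it, which is the point of doing QC on the residuals after the fit rather than filtering raw intensities beforehand.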
Thanks for the reference to our work, Wolfgang. I hope it isn't considered spamming if I put a link to a poster presentation:

http://www.affymetrix.com/corporate/events/seminar/microarray_workshop.affx

I concur with Wolfgang that residuals from fits provide great quality assessment opportunities. I differ on whether chips with obvious scratches and the like should be summarily dismissed. On arrays where the probes of a probe set are scattered across the chip, which includes all of the new designs, glaring image artifacts can sometimes have very little effect on the summaries. To make this judgment, it helps to summarize the residuals in a way that reflects probe set summary precision. We recommend pooling unscaled standard errors over probe sets on a chip to judge the relative quality of chips in a set. These are relative measures within a chip set; to assess the overall quality of a set of chips, the probe set model residual scales must be examined.

We will soon have a link to a more detailed presentation. A paper and software are due out shortly.

-francois

w.huber@dkfz-heidelberg.de wrote:
> I think Francois Collins has a nice method for QC based on the residuals
> of the probe-set-summary model.
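One plausible reading of "pooling standard errors over probe sets on a chip" can be sketched as follows. This is only an illustration in plain Python, not the software mentioned above: the standard errors are invented, the step of dividing each probe set's values by its median across chips is an assumption made here so that probe sets are comparable, and the function name `relative_chip_quality` and the cutoff `CUTOFF` are hypothetical.

```python
from statistics import median

# Invented standard errors of probe-set summaries:
# rows are probe sets, columns are chips.
se = [
    [0.10, 0.11, 0.30, 0.09],
    [0.20, 0.22, 0.55, 0.21],
    [0.05, 0.05, 0.16, 0.06],
]


def relative_chip_quality(se):
    """Return one pooled score per chip; values near 1 are typical for the set."""
    n_chips = len(se[0])
    scaled = []
    for row in se:
        # Assumption: scale each probe set by its median SE across chips,
        # so that noisy and quiet probe sets contribute on the same footing.
        row_median = median(row)
        scaled.append([value / row_median for value in row])
    # Pool over probe sets by taking the median scaled SE on each chip.
    return [median(row[j] for row in scaled) for j in range(n_chips)]


scores = relative_chip_quality(se)

CUTOFF = 1.5  # arbitrary threshold for this toy example
suspect = [j for j, s in enumerate(scores) if s > CUTOFF]
print([round(s, 2) for s in scores], suspect)
```

Because every score is scaled against the other chips in the same set, it only says which chips stand out within that set, not whether the set as a whole is good.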