Removing probes before or after normalization
@bornman-daniel-m-1391
Last seen 9.7 years ago
Dear BioC,

I have a custom chip with probes for multiple microbial organisms, but I am currently only interested in the results for one of them. At what step in the analysis process is it advised to remove the other organisms from the analysis? I worry that probes specific to those 'other' organisms may contribute to the background noise. In that case, maybe I should remove them prior to normalization and background correction. Otherwise, maybe prior to independent testing and p-value adjustment. And, if not there, then prior to annotation.

Thank You,
Daniel Bornman
Researcher
Battelle Memorial Institute
505 King Ave
Columbus, OH 43201
614.424.3229
Annotation Normalization PROcess • 1.5k views
Jenny Drnevich ★ 2.2k
@jenny-drnevich-382
Last seen 9.7 years ago
Hi Daniel,

I have been wondering about this myself recently. All the examples of gene filtering that I have seen do the filtering after the pre-processing steps, which is what I routinely do. I don't think I've seen a formal argument for this anywhere, but it seems that genes that are "Absent" (Affy calls) on all arrays, and/or genes that have little variation across arrays (although I don't personally filter on this), are part of the set of genes that do not change expression with treatment. Given that most normalization methods assume that most genes are not changing, you would not want to remove a portion of these genes before normalization; otherwise you increase the proportion of genes that do change and perhaps decrease the efficacy of the normalization.

On the other hand, I have also worked with Affy's soybean chips, which have probe sets from two other species (pests, I believe) in addition to soybean. In that case, we removed the non-soybean genes before pre-processing, mostly because we were running into memory problems. I hope we are not being arbitrary in removing non-species-of-interest genes before normalization and then filtering species-specific genes after normalization using different criteria! Any other thoughts?

Cheers,
Jenny

(In reply to Bornman, Daniel M, 01:34 PM 4/21/2006.)

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu
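Jenny's concern, that filtering out unchanging genes before normalization can undermine the "most genes do not change" assumption, can be illustrated with a toy example. This is only a sketch: median centering of log-ratios stands in for whatever normalization is actually used, and the gene counts, the offset, and the function name `median_center` are made up for illustration.

```python
from statistics import median


def median_center(log_ratios):
    """Subtract the median so that the 'typical' gene ends up at log-ratio 0."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


# Hypothetical data: every log-ratio on this array carries a technical offset of +1.
offset = 1.0
true_unchanged = [0.0] * 7  # genes that do not respond to treatment
true_changed = [2.0] * 3    # genes that are truly up-regulated
observed = [x + offset for x in true_unchanged + true_changed]

# Normalize with all genes present: the median falls on an unchanged gene,
# so the offset is removed and the changed genes come out at 2.0.
full = median_center(observed)

# Remove five of the unchanged genes first, then normalize what is left:
# changed genes are now the majority, so the median falls on one of them.
filtered_first = median_center(observed[5:])

print(full)
print(filtered_first)
```

In the second case the truly changed genes are centered at 0 and the unchanged genes look down-regulated, which is the failure mode described above. Real normalization methods are more robust than a plain median, so the effect in practice would depend on how many genes are removed.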
Speaking as a statistician and not from experience with this type of array, I would think that you would want to remove foreign probes before normalization. If they hybridize to the sample, who knows what they are doing. If they do not hybridize, they should add a huge number of data points at the very low expression values. These probes are all "negative controls" and do not have the same sources of variation as the real probes. Hence, I think that they could adversely affect the normalization of low-expression genes, which are often the most interesting genes in the data.

Again, this argument is not based on experience. However, I have used arrays with about a hundred negative controls, and these controls did have some surprisingly consistent patterns, showing they were not entirely negative.

--Naomi

(In reply to Jenny Drnevich, 02:59 PM 4/21/2006.)

Naomi S. Altman
Associate Professor
Dept. of Statistics
Penn State University
University Park, PA 16802-2111
814-865-3791 (voice)
814-863-7114 (fax)
814-865-1348 (Statistics)
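Naomi's point about foreign probes distorting the low end of the intensity range can be sketched with a small quantile-normalization example. The thread does not name a normalization method, so quantile normalization is an assumption here, as are all intensity values, the scenario in which the foreign probes pick up signal on only one array, and the helper name `quantile_normalize`.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of distinct values.

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are not handled in this sketch.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices, lowest first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical log-intensities for four probes of the organism of interest.
target_a = [5.0, 6.0, 9.0, 12.0]
target_b = [5.5, 6.5, 9.5, 12.5]

# Hypothetical foreign probes: near background on array A, but two of them
# pick up signal on array B (for example through cross-hybridization).
foreign_a = [2.0, 2.25, 2.5]
foreign_b = [2.0, 7.0, 8.0]

# Option 1: drop the foreign probes, then normalize.
targets_only = quantile_normalize([target_a, target_b])

# Option 2: normalize everything, then keep only the target probes.
with_foreign = quantile_normalize([target_a + foreign_a, target_b + foreign_b])
n_target = len(target_a)
with_foreign_targets = [arr[:n_target] for arr in with_foreign]

print(targets_only)
print(with_foreign_targets)
```

With the foreign probes removed first, the two arrays agree on every target probe. With them included, the two highest target probes are unaffected, but the two lowest are pushed apart between arrays, because the foreign probes occupy different ranks on each array. Whether real foreign probes behave this way is exactly the open question in the thread.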