I was planning to filter summarized/normalized Affy expression data to
keep the genes in the top 50% by variance, but I saw the recent paper on an
algorithm called FLUSH in NAR that filters on probe-level data. What is
the recommendation of the collective BioC consciousness about the stage of
data analysis at which one should filter data using variance?
Dennis Burian, Ph.D.
Functional Genomics Group
Civil Aerospace Medical Institute, AAM-610
6500 S. MacArthur Blvd.
Oklahoma City OK 73169
dennis.burian at faa.gov
How to filter depends on your purpose. Filtering on variance might be a good way to select genes for a PCA plot, for example. If, however, your intention is to do a differential expression (DE) analysis using limma, then you should absolutely not be filtering on variance. For limma (or for any program that borrows information between genes), you should instead filter on expression level or not at all.
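
To make the distinction concrete, here is a rough sketch of the two kinds of filter on a toy data set. It is plain Python rather than anything from limma or Bioconductor, and the gene names, expression values, function names, and the cutoff of 5.0 are all made up for illustration.

```python
from statistics import mean, variance

# Toy log2 expression values per gene across four arrays (made-up numbers).
expr = {
    "geneA": [8.1, 8.3, 11.9, 12.2],  # high level, high variance
    "geneB": [9.0, 9.1, 9.0, 9.2],    # high level, low variance
    "geneC": [3.1, 3.0, 3.2, 3.1],    # low level, low variance
    "geneD": [3.0, 5.5, 2.8, 5.9],    # low level, high variance
}

def filter_by_variance(expr, keep_fraction=0.5):
    """Keep the most variable fraction of genes (e.g. for a PCA plot)."""
    ranked = sorted(expr, key=lambda g: variance(expr[g]), reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return set(ranked[:n_keep])

def filter_by_expression(expr, min_mean=5.0):
    """Keep genes whose mean level exceeds a cutoff (hypothetical threshold)."""
    return {g for g, values in expr.items() if mean(values) > min_mean}

by_var = filter_by_variance(expr)      # geneA and geneD survive
by_level = filter_by_expression(expr)  # geneA and geneB survive
```

In this toy case the variance filter throws away geneB, a clearly expressed gene with a small variance, while the expression-level filter keeps it. Removing the low-variance genes changes the set of variances that a method borrowing information between genes gets to see, which is presumably why filtering on expression level is the safer choice before a limma analysis.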