Hi everyone,
I've always heard that one way "around" the multiple testing problem with microarrays is to identify, a priori, a particular list of genes you're interested in, so that you only have to apply the multiple testing correction to this smaller list. I've never done this in practice, and I'm not sure at what point in the analysis it's proper to pull out the smaller list. Obviously, all the data preprocessing and normalization will be done with all the genes, but should I pull out the genes before fitting the model, or after fitting the model and right before the multiple testing adjustment? I'm using the eBayes() shrinkage in limma, so which genes are included in the fit will make a big difference in the outcome.
I'm thinking it would be best to keep all the genes in the model, then split them into two groups (genes of interest and all the rest) and do an FDR correction separately for each group. What do you think?
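To make the second option concrete, here is a minimal sketch of just the adjustment step, written in plain Python so it stands alone. It assumes the per-gene p-values have already come out of the model fitted on all genes. The gene names, the p-values, and the helper names bh_adjust and split_and_adjust are all made up for illustration, and I'm assuming Benjamini-Hochberg as the FDR method (in R this step would be p.adjust with method="BH").

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def split_and_adjust(p_values, genes_of_interest):
    """Adjust the genes of interest and the remaining genes as two separate lists."""
    inside = [g for g in p_values if g in genes_of_interest]
    outside = [g for g in p_values if g not in genes_of_interest]
    adj_inside = dict(zip(inside, bh_adjust([p_values[g] for g in inside])))
    adj_outside = dict(zip(outside, bh_adjust([p_values[g] for g in outside])))
    return adj_inside, adj_outside


# Hypothetical p-values from a fit on ALL genes (toy numbers, not real data).
p_values = {
    "geneA": 0.001, "geneB": 0.02, "geneC": 0.04, "geneD": 0.03,
    "geneE": 0.2, "geneF": 0.5, "geneG": 0.8, "geneH": 0.9,
}
genes_of_interest = {"geneA", "geneB", "geneC"}  # chosen a priori

# Adjustment over the full list, for comparison.
genes = list(p_values)
adj_all = dict(zip(genes, bh_adjust([p_values[g] for g in genes])))

# Adjustment done separately within each group.
adj_interest, adj_rest = split_and_adjust(p_values, genes_of_interest)

for g in sorted(genes_of_interest):
    print(g, "full list:", adj_all[g], "interest list only:", adj_interest[g])
```

The only thing that changes between the two approaches here is the number of tests each p-value is corrected for; the p-values themselves are the same because the fit used every gene.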
Thanks,
Jenny
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu