RMA with few arrays


Ann Hess ▴ 340

@ann-hess-251

Last seen 9.6 years ago

I am wondering whether it is appropriate to compute expression indices with RMA for only a small number of arrays (2 or 3). I have seen in previous posts that, with few arrays, RMA can return identical expression indices across arrays for some probe sets (because of the median polish algorithm), and I have observed this in the data in question. Is it appropriate to simply remove those probe sets from downstream analysis? Is there a problem with computing RMA for such a small group of arrays?

The data were generated by a scientist interested in comparing three treatments. However, they ran a single replicate of each treatment and then repeated the experiment at a later date. So there are 6 arrays in total, but they come from two separate experiments. Currently, they are only interested in comparing two of the treatments.

My plan was to run RMA on each experiment separately (since I don't think it is appropriate to normalize all the arrays together), then combine the results and use a paired t-test to test for differential gene expression between the two treatment groups of interest.

When I tried this approach (using limma and multtest), the results looked good until I took a closer look at the RMA values and noticed the identical expression indices across arrays for some probe sets.

Any suggestions would be greatly appreciated.

Ann
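For readers wondering why median polish can give identical values on few arrays, here is a minimal sketch in plain Python. It is not the affy implementation: the function `median_polish` and the toy intensities are made up for illustration, and the background correction and quantile normalization that RMA applies before summarization are skipped. With three arrays, each probe's row median is one of the three observed values, so every probe leaves an exact zero residual in one array. Any array whose residuals have a median of zero gets a column effect of exactly zero, and two such arrays end up with the same summary value.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-9):
    """Tukey-style median polish on a probes x arrays matrix.

    Returns (overall, row_effects, col_effects). A sketch for
    illustration, not the code used inside RMA.
    """
    resid = [row[:] for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    overall = 0.0
    for _ in range(max_iter):
        # Sweep out row (probe) medians.
        row_delta = [median(row) for row in resid]
        for i in range(n_rows):
            resid[i] = [v - row_delta[i] for v in resid[i]]
            row_eff[i] += row_delta[i]
        # Re-centre the column effects, moving their median into the overall term.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep out column (array) medians.
        col_delta = [median(resid[i][j] for i in range(n_rows))
                     for j in range(n_cols)]
        for i in range(n_rows):
            resid[i] = [resid[i][j] - col_delta[j] for j in range(n_cols)]
        col_eff = [c + d for c, d in zip(col_eff, col_delta)]
        # Re-centre the row effects in the same way.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        # Stop once a full pass changes nothing.
        if all(abs(d) < tol for d in row_delta + col_delta):
            break
    return overall, row_eff, col_eff


# Hypothetical log2 intensities for one probe set: 5 probes (rows) x 3 arrays.
# Arrays 1 and 2 differ at every single probe.
probes_by_arrays = [
    [8.00, 8.25, 7.75],
    [9.50, 9.25, 10.00],
    [7.00, 7.25, 7.75],
    [10.25, 10.00, 9.50],
    [8.25, 8.75, 9.50],
]

overall, _, col_eff = median_polish(probes_by_arrays)
# Summary value per array = overall level + array (column) effect.
expression = [overall + c for c in col_eff]
print(expression)  # [8.75, 8.75, 9.25]
```

The first two arrays get the same summary value even though their probe-level values never agree, which is the kind of tie described in the question. With more arrays, the row medians are less likely to pin so many residuals to exactly zero.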

probe limma • 825 views

updated 17.6 years ago by James W. MacDonald 65k • written 17.6 years ago by Ann Hess ▴ 340


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

Hi Ann,

Ann Hess wrote:
> I am wondering if it is appropriate to compute expression indices with RMA
> for only a small number of arrays (2 or 3). I have seen in previous posts
> that the expression indices can be identical for some probe sets across
> different arrays when using RMA with few arrays (because of the median
> polish algorithm). I have observed this for the data in question. Is it
> appropriate to just remove those probe sets from downstream analysis? Is
> there a problem with computing RMA for such a small group of arrays?

The parameter estimate that is probably not very good in this scenario is the probe effect, which is going to be ignored anyway. So is it a Really Good Thing? Not really. However, I'm not sure that any other method of computing expression values is going to excel in this situation, so what are ya gonna do? In a perfect world you would have hog-tied the Biologist until (s)he agreed to run more replicates. ;-D

> The data was generated by a scientist interested in comparing three
> treatments. However, they ran a single replicate of each treatment and
> then reproduced the experiment at a later date. So, there are a total of
> 6 arrays, but they come from two separate experiments. Currently, they
> are just interested in comparing two of the treatments.
>
> My plan was to run RMA for each of the experiments separately (since I
> don't think it is appropriate to normalize all the arrays together). Then
> combine the results and use a paired t-test to test for differential gene
> expression for the two treatment groups of interest.

I'm not convinced that you should run rma separately. I would first look at the raw data and see if it looks OK to run them all together. A quick look at a density plot for each chip is a good start, then you might try fitPLM() in affyPLM and look at the results of NUSE(pset) and RLE(pset). Doing a plot of the first two principal components wouldn't hurt either.

If the boxplots that result from NUSE() and RLE() look reasonable, the density plots line up relatively close, and the three sample types group sorta close on a PCA plot, then I would run them all together and be happy that you don't have to wrangle with batch effects.

> When I tried this approach (using limma and multtest), the results
> actually looked good until I took a closer look at the RMA values and
> noticed the identical expression indices across arrays for some probe
> sets.

The identical expression values are more of an artifact than something to be concerned about. Dollars to donuts, if you run all six together, these same probesets will have super low variance and won't come up as significant anyway (and will likely be filtered out if you filter based on variance or IQR or some such thing).

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
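As a rough illustration of the "filter based on variance or IQR" idea in the reply above, here is a minimal sketch in plain Python (3.8+). The probe set names, the expression values, and the cutoff of 0.5 log2 units are all hypothetical, and the helpers `iqr` and `filter_by_iqr` are invented for this example rather than taken from any Bioconductor package.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range, using linear interpolation between data points."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def filter_by_iqr(expr, min_iqr):
    """Keep only probe sets whose IQR across arrays is at least min_iqr."""
    return {name: vals for name, vals in expr.items() if iqr(vals) >= min_iqr}


# Hypothetical log2 expression values for six arrays per probe set.
expr = {
    "ps_flat": [8.75, 8.75, 8.75, 8.75, 8.75, 8.75],   # identical on every array
    "ps_low": [7.0, 7.0, 7.25, 7.0, 7.25, 7.0],        # barely varies
    "ps_changed": [6.0, 8.0, 6.5, 6.0, 8.5, 6.5],      # varies across arrays
}

# The 0.5 cutoff is an arbitrary choice for this example.
kept = filter_by_iqr(expr, min_iqr=0.5)
print(list(kept))  # ['ps_changed']
```

A probe set with identical values on every array has an IQR of exactly zero, so any positive cutoff removes it before testing.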

written 17.6 years ago by James W. MacDonald 65k
