RMA with few arrays
Ann Hess ▴ 340
@ann-hess-251
I am wondering if it is appropriate to compute expression indices with RMA for only a small number of arrays (2 or 3). I have seen in previous posts that the expression indices can be identical for some probe sets across different arrays when using RMA with few arrays (because of the median polish algorithm). I have observed this for the data in question. Is it appropriate to just remove those probe sets from downstream analysis? Is there a problem with computing RMA for such a small group of arrays?

The data was generated by a scientist interested in comparing three treatments. However, they ran a single replicate of each treatment and then reproduced the experiment at a later date. So, there are a total of 6 arrays, but they come from two separate experiments. Currently, they are just interested in comparing two of the treatments.

My plan was to run RMA for each of the experiments separately (since I don't think it is appropriate to normalize all the arrays together), then combine the results and use a paired t-test to test for differential gene expression between the two treatment groups of interest.

When I tried this approach (using limma and multtest), the results actually looked good until I took a closer look at the RMA values and noticed the identical expression indices across arrays for some probe sets.

Any suggestions would be greatly appreciated.

Ann
probe limma • 825 views
@james-w-macdonald-5106
Hi Ann,

Ann Hess wrote:
> I am wondering if it is appropriate to compute expression indices with RMA
> for only a small number of arrays (2 or 3). I have seen in previous posts
> that the expression indices can be identical for some probe sets across
> different arrays when using RMA with few arrays (because of the median
> polish algorithm). I have observed this for the data in question. Is it
> appropriate to just remove those probe sets from downstream analysis? Is
> there a problem with computing RMA for such a small group of arrays?

The parameter estimate that is probably not very good in this scenario is the probe effect, which is going to be ignored anyway. So is it a Really Good Thing? Not really. However, I'm not sure that any other method of computing expression values is going to excel in this situation, so what are ya gonna do? In a perfect world you would have hog-tied the Biologist until (s)he agreed to run more duplicates. ;-D

> The data was generated by a scientist interested in comparing three
> treatments. However, they ran a single replicate of each treatment and
> then reproduced the experiment at a later date. So, there are a total of
> 6 arrays, but they come from two separate experiments. Currently, they
> are just interested in comparing two of the treatments.
>
> My plan was to run RMA for each of the experiments separately (since I
> don't think it is appropriate to normalize all the arrays together). Then
> combine the results and use a paired t-test to test for differential gene
> expression for the two treatment groups of interest.

I'm not convinced that you should run rma separately. I would first look at the raw data and see if it looks OK to run them all together. A quick look at a density plot for each chip is a good start, then you might try fitPLM() in affyPLM and look at the results of nuse(pset) and RLE(pset). Doing a plot of the first two principal components wouldn't hurt either.

If the boxplots that result from nuse() and RLE() look reasonable, the density plots line up relatively close, and the three sample types group sorta close on a PCA plot, then I would run them all together and be happy that you don't have to wrangle with batch effects.

> When I tried this approach (using limma and multtest), the results
> actually looked good until I took a closer look at the RMA values and
> noticed the identical expression indices across arrays for some probe
> sets.

The identical expression values are more of an artifact than something to be concerned about. Dollars to donuts, if you run all six together, these same probesets will have super low variance and won't come up as significant anyway (and will likely be filtered out if you filter based on variance or IQR or some such thing).

Best,

Jim

> Any suggestions would be greatly appreciated.
>
> Ann

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
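For readers wondering how median polish can return exactly the same value for two arrays, here is a minimal sketch of the summarization step alone. It is plain Python rather than the affy implementation, it skips RMA's background correction and quantile normalization, and the function name `median_polish_summary` and the five-probe, three-array data are made up for illustration. With three arrays, each row sweep leaves a zero in whichever array holds the middle value, so a column's median residual can easily be exactly zero for more than one array.

```python
from statistics import median


def median_polish_summary(log_pm, max_iter=10, tol=1e-9):
    """Tukey median polish on a probes x arrays matrix of log2 intensities.

    Returns one summary value per array (overall effect + array effect).
    Illustrative only: not the affy/RMA code, and no normalization is done.
    """
    resid = [list(row) for row in log_pm]
    n_probes, n_arrays = len(resid), len(resid[0])
    overall = 0.0
    probe_eff = [0.0] * n_probes
    array_eff = [0.0] * n_arrays

    for _ in range(max_iter):
        # Row sweep: move each probe's median residual into its probe effect.
        row_med = [median(row) for row in resid]
        for i, m in enumerate(row_med):
            resid[i] = [x - m for x in resid[i]]
            probe_eff[i] += m
        shift = median(array_eff)
        array_eff = [c - shift for c in array_eff]
        overall += shift

        # Column sweep: move each array's median residual into its array effect.
        col_med = [median(row[j] for row in resid) for j in range(n_arrays)]
        for row in resid:
            for j, m in enumerate(col_med):
                row[j] -= m
        array_eff = [c + m for c, m in zip(array_eff, col_med)]
        shift = median(probe_eff)
        probe_eff = [r - shift for r in probe_eff]
        overall += shift

        # Stop once a full pass no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return [overall + c for c in array_eff]


# Hypothetical probe set: 5 probes (rows) x 3 arrays A, B, C (columns).
# A and B differ on four of the five probes.
probeset = [
    [8.00, 8.25, 9.00],
    [7.00, 6.75, 7.75],
    [9.25, 9.50, 10.25],
    [6.50, 6.25, 7.50],
    [8.75, 8.75, 9.25],
]

expr = median_polish_summary(probeset)
print(expr)  # [8.25, 8.25, 9.0]
```

Arrays A and B get the same summary even though their probe-level values differ, which is the artifact described in the thread. Such probe sets have zero variance across those arrays, so a variance or IQR filter would drop them.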