I am currently analysing microarray data. There are 20 arrays, divided into 5 animals and 4 treatments. It is a repeated-measures experiment.
I ran a PCA to see whether the treatments explain the variance, and saw one array sitting quite far from the rest of the arrays. This happens both when using the whole data set (PC1 = -200, PC2 = 50) and when using only the differentially expressed genes (PC1 = -100, PC2 = -30).
A more graduated approach might be to use arrayWeights, which should assign a lower weight to any outlier array whose signal is highly variable relative to its replicates. This reduces its influence on the linear modelling, DE testing, etc. without requiring the drastic action of tossing out the array altogether. I prefer not to remove arrays if possible, as that means I'm throwing out data and reducing the residual d.f. available to estimate the variance, and hence the power to detect DE (as you might have witnessed yourself, from the reduction in DE genes when the affected animal is removed). It's also hard to draw the line between what is an outlier and what isn't when you have small numbers of samples.
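
To give some intuition for what downweighting does, here is a rough Python sketch. It is not limma's arrayWeights algorithm, which fits a proper variance model across genes. It only shows the basic idea under assumptions I made up for illustration: simulated data for a balanced 5 animals x 4 treatments layout, one hypothetical noisy array, an additive animal + treatment fit per gene, and a weight for each array that is inversely proportional to its mean squared residual.

```python
import random
from statistics import mean

random.seed(1)
N_ANIMALS, N_TREATMENTS, N_GENES = 5, 4, 200
OUTLIER = (2, 1)  # hypothetical (animal, treatment) index of the noisy array


def simulate_gene():
    """One gene's log-expression as an animal x treatment grid (made-up data)."""
    animal_fx = [random.gauss(0, 1) for _ in range(N_ANIMALS)]
    treat_fx = [random.gauss(0, 1) for _ in range(N_TREATMENTS)]
    return [[animal_fx[a] + treat_fx[t]
             + random.gauss(0, 1.5 if (a, t) == OUTLIER else 0.3)
             for t in range(N_TREATMENTS)]
            for a in range(N_ANIMALS)]


def residuals(y):
    """Residuals from the additive animal + treatment fit (balanced design only)."""
    grand = mean(v for row in y for v in row)
    animal_mean = [mean(row) for row in y]
    treat_mean = [mean(y[a][t] for a in range(N_ANIMALS))
                  for t in range(N_TREATMENTS)]
    return [[y[a][t] - animal_mean[a] - treat_mean[t] + grand
             for t in range(N_TREATMENTS)]
            for a in range(N_ANIMALS)]


def crude_array_weights(genes):
    """Weight each array by (average residual variance) / (its residual variance)."""
    msr = [[0.0] * N_TREATMENTS for _ in range(N_ANIMALS)]
    for y in genes:
        r = residuals(y)
        for a in range(N_ANIMALS):
            for t in range(N_TREATMENTS):
                msr[a][t] += r[a][t] ** 2 / len(genes)
    overall = mean(v for row in msr for v in row)
    return [[overall / v for v in row] for row in msr]


genes = [simulate_gene() for _ in range(N_GENES)]
weights = crude_array_weights(genes)

# Find the array with the smallest weight, i.e. the noisiest one.
lowest = min(((a, t) for a in range(N_ANIMALS) for t in range(N_TREATMENTS)),
             key=lambda at: weights[at[0]][at[1]])
print("lowest-weight array (animal, treatment):", lowest)
for row in weights:
    print(" ".join(f"{w:5.2f}" for w in row))
```

The noisy array stays in the model but gets a small weight, so it contributes little to the fit while the residual d.f. are preserved. In a real analysis the weights from arrayWeights would be passed on to the linear model fit rather than computed by hand like this.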