Question

Affymetrix: RMA probe summarization for identical probe sequences


Gaj Stan BIGCAT ▴ 20

@gaj-stan-bigcat-4764

Last seen 11.4 years ago

Hello all,

I have a very specific question about how the probe summarization step works during RMA normalization. Let's say that I have a reannotated probeset (custom CDF) on an Affymetrix chip that looks like this: http://arrayanalysis.mbni.med.umich.edu/probeset/ps_pb.jsp?p=ENSG00000087076&c=HGU133Plus2_Hs_ENSG_13

This reannotated probeset seems to contain several identical probes, but they sit at physically different locations on the chip (although they're not that far away from each other). How are these identical probes handled during the probe summarization step? As independent measurements? Averaged prior to summarization? Or something else?

In the end, what effect would these repeated sequences have on the calculated (median-polished) probeset intensity? If I understand the approach correctly, it will not have a drastic effect on the outcome, since the assumption is that all probes in a given probeset should measure the same intensity...

Many thanks in advance!

-- Stan

P.S.: I'm currently in disagreement with the BioC mailing list settings, so my apologies if this post reached the list twice!

Normalization probe • 1.2k views

updated 14.6 years ago by Kasper Daniel Hansen ★ 6.5k • written 14.6 years ago by Gaj Stan BIGCAT ▴ 20

score 0 · Answer 1 · 2011-07-20

On Wed, Jul 20, 2011 at 9:45 AM, Gaj Stan (BIGCAT) <stan.gaj at maastrichtuniversity.nl> wrote:

> Hello all,
>
> I have a very specific question regarding the working mechanisms behind the probe summarization step during RMA normalization. Let's say that I have a reannotated probeset (customCDF) on an Affymetrix chip that looks like this: http://arrayanalysis.mbni.med.umich.edu/probeset/ps_pb.jsp?p=ENSG00000087076&c=HGU133Plus2_Hs_ENSG_13
>
> This reannotated probeset seems to contain several identical probes, but these are located on a (physically) different location on the chip (although they're not that far away from each other). How are these identical probes handled during the probe summarization step? As independant measurements? Averaged prior to summarization? Or anything else?
>
> In the end, what effect would these repeated sequences have on the calculated (median-polished) probeset intensity? If I understand the approach correctly, it will not have a drastic effect on the outcome, since the assumption is that all probes in a given probeset should measure the same intensity...
>
> Many thanks in advance!

When you do median polish, each probe you pass in through the CDF environment is treated as an independent measurement. If you have many probes that are exactly equal, they simply give more weight to that sequence's behaviour.

Note that in your example, when the same sequence is spotted multiple times, the copies are adjacent to each other, which implies that their probe intensities will be more similar (less spatial variation across the chip). I think this is a bit weird.

I would (perhaps) worry more about this if you had a probeset with sequence 1 spotted 10 times and sequences 2-5 each spotted once. In that case, median polish will put a lot of weight on sequence 1.

You could argue that identical sequences should be filtered out of the CDF file, but that needs to be done when you create the CDF environment; it is not something rma does. In fact, rma does not "know" about the actual probe sequence.

> In the end, what effect would these repeated sequences have on the calculated (median-polished) probeset intensity? If I understand the approach correctly, it will not have a drastic effect on the outcome, since the assumption is that all probes in a given probeset should measure the same intensity...

This is an assumption. It may or may not be true in the real world.

Kasper
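To make the weighting argument concrete, here is a minimal sketch of a Tukey-style median polish in plain Python. It is not the code that `rma` actually runs (the iteration limit and tolerance here are arbitrary choices), and the log2 intensities, the sequence labels `seq1`..`seq5`, and the helper names `median_polish_summary` and `dedupe_by_sequence` are all made up for illustration. Rows are probes, columns are arrays, and every row counts as one independent measurement, exactly as described above. Keeping only the first copy of each sequence is just one possible filtering policy, assumed here for simplicity.

```python
from statistics import median


def median_polish_summary(matrix, max_iter=10, tol=1e-9):
    """Per-array summaries (overall + column effect) for a probes x arrays
    matrix of log2 intensities. Every row is treated as an independent probe."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [list(row) for row in matrix]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep row (probe) medians out of the residuals.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            resid[i] = [v - row_med[i] for v in resid[i]]
            row_eff[i] += row_med[i]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep column (array) medians out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
            col_eff[j] += col_med[j]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return [overall + c for c in col_eff]


def dedupe_by_sequence(probes):
    """Keep the first probe per sequence (hypothetical filtering policy)."""
    seen, kept = set(), []
    for seq, values in probes:
        if seq not in seen:
            seen.add(seq)
            kept.append((seq, values))
    return kept


# Hypothetical probeset: seq1 reads high on array 3, seq2..seq5 do not.
odd_probe = ("seq1", [8, 8, 10])
others = [("seq2", [7, 7, 7]), ("seq3", [9, 9, 9]),
          ("seq4", [6, 6, 6]), ("seq5", [8, 8, 8])]

spotted_once = [odd_probe] + others
spotted_ten_times = [odd_probe] * 10 + others

once = median_polish_summary([v for _, v in spotted_once])
ten = median_polish_summary([v for _, v in spotted_ten_times])
filtered = median_polish_summary(
    [v for _, v in dedupe_by_sequence(spotted_ten_times)])

print("seq1 spotted once:     ", once)
print("seq1 spotted ten times:", ten)
print("after deduplication:   ", filtered)
```

With seq1 present once, the other four probes outvote it and array 3 is summarized at 8, the same as arrays 1 and 2. With seq1 present ten times, its copies form the majority and array 3 is summarized at 10. Filtering duplicates before summarization restores the first result, which is why any such filtering has to happen when the CDF environment is built.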