Question

Re: Use of RMA in increasingly-sized datasets


t-kawai@hhc.eisai.co.jp ▴ 50

@t-kawaihhceisaicojp-503

Last seen 9.7 years ago

I am facing exactly the same problem now. I think there are two key steps in RMA that depend on the set of chips in a run: the quantile normalization step and the median polish summarization step. The target value of each quantile of probe intensity is the geometric mean of the same quantile across the entire chip set in the run. Likewise, the expression value of each probe set, summarized from 11-20 probe intensities, is calculated by the median polish algorithm using that probe set across the entire chip set.

Therefore, the suggested use of "a standard training set of 50 chips" is effective in practice: the quantile target values change very little when one chip is added to the 50-chip standard set, and the medians used in the summarization step are robust enough for the resulting 51-chip data set. However, this method is very tedious when we process several chips one by one, and it is impossible to create the standard set at the beginning of a project.

I am also looking forward to hearing a good solution to this problem.

Bye.
Kawai

_______________________________________
Takatoshi Kawai, Ph.D.
Senior Scientist, Bioinformatics
Laboratory of Seeds Finding Technology
Eisai Co., Ltd.
5-1-3 Tokodai, Tsukuba-shi, Ibaraki 300-2635, Japan
TEL: +81-29-847-7192
FAX: +81-29-847-7614
e-mail: t-kawai@hhc.eisai.co.jp
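The two chip-set-dependent steps described in the post above can be illustrated with a minimal plain-Python sketch. It is not the code used by any RMA implementation: the function names (`quantile_target`, `max_relative_shift`, `median_polish`) are hypothetical, the data are simulated, and the target is assumed to be the plain mean of the sorted intensities across chips (averaging log intensities instead would give a geometric mean). The median polish uses a fixed number of sweeps rather than a convergence test.

```python
import random
from statistics import mean, median


def quantile_target(chips):
    """Target distribution: mean of the k-th smallest intensity across all chips."""
    sorted_chips = [sorted(chip) for chip in chips]
    return [mean(values) for values in zip(*sorted_chips)]


def max_relative_shift(old_target, new_target):
    """Largest relative change in any target value (assumes positive intensities)."""
    return max(abs(new - old) / old for old, new in zip(old_target, new_target))


def median_polish(table, n_iter=10):
    """Tukey median polish on a probes x chips table of log2 intensities.

    Returns one expression estimate per chip (overall effect + chip effect).
    Uses a fixed number of sweeps instead of a convergence check.
    """
    resid = [list(row) for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows  # probe effects
    col_eff = [0.0] * n_cols  # chip effects
    for _ in range(n_iter):
        # Sweep out row (probe) medians.
        for i in range(n_rows):
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            row_eff[i] += m
        m = median(col_eff)
        col_eff = [c - m for c in col_eff]
        overall += m
        # Sweep out column (chip) medians.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            for i in range(n_rows):
                resid[i][j] -= m
            col_eff[j] += m
        m = median(row_eff)
        row_eff = [r - m for r in row_eff]
        overall += m
    return [overall + c for c in col_eff]


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated data only: 51 chips x 1000 probes, log-normal intensities.
    chips = [[rng.lognormvariate(7, 1) for _ in range(1000)] for _ in range(51)]
    shift = max_relative_shift(quantile_target(chips[:50]), quantile_target(chips))
    print(f"largest relative change in the target after adding chip 51: {shift:.4f}")

    # One simulated probe set: 11 probes (rows) x 51 chips (columns), log2 scale.
    probe_set = [[rng.gauss(8, 0.5) for _ in range(51)] for _ in range(11)]
    print(median_polish(probe_set)[:3])
```

Both functions take the whole chip set as input, which is why re-running them after adding a chip can change the values of every earlier chip. Comparing `quantile_target(chips[:50])` with `quantile_target(chips)` gives a rough feel for how much the 51st chip moves the target.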

Normalization probe PROcess • 671 views

updated 19.0 years ago by Swati Ranade ▴ 60 • written 19.0 years ago by t-kawai@hhc.eisai.co.jp ▴ 50

score 0 · Answer 1 · 2005-06-06

Hi Kawai,

I think you might be able to generate a dummy training set from published microarray data.

Swati
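One way the training-set idea could be applied is to compute the reference once from a fixed set of chips (for example, published data for the same array type), store it, and then process each new chip on its own. The sketch below is only an illustration under those assumptions. All names are hypothetical, the numbers are toy values, and `summarize_with_frozen_probe_effects` is a simplified stand-in for median polish (a median of probe-effect-corrected values), not RMA's own summarization.

```python
from statistics import mean, median


def build_reference(training_chips):
    """Compute the quantile target once from a fixed training set and keep it."""
    sorted_chips = [sorted(chip) for chip in training_chips]
    return [mean(values) for values in zip(*sorted_chips)]


def normalize_to_reference(chip, reference):
    """Map each intensity of a single chip onto the frozen reference by rank."""
    if len(chip) != len(reference):
        raise ValueError("chip and reference must have the same number of probes")
    # Stable sort: tied intensities keep their input order.
    order = sorted(range(len(chip)), key=lambda i: chip[i])
    normalized = [0.0] * len(chip)
    for rank, probe_index in enumerate(order):
        normalized[probe_index] = reference[rank]
    return normalized


def summarize_with_frozen_probe_effects(log2_probes, probe_effects):
    """Median of log2 intensities after subtracting stored probe effects.

    Simplified stand-in for median polish on one probe set of one chip;
    the probe effects are assumed to come from the training set.
    """
    return median(v - e for v, e in zip(log2_probes, probe_effects))


# Toy numbers only: a two-chip "training set" with three probes.
reference = build_reference([[1, 2, 3], [3, 4, 5]])
new_chip = normalize_to_reference([30, 10, 20], reference)
expression = summarize_with_frozen_probe_effects([8.0, 9.0, 10.0], [-1.0, 0.0, 1.0])
```

Because the reference never changes, results for chips processed earlier stay the same when a later chip arrives. The trade-off is that the results depend on how well the training set represents the project's own samples.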