multi-experiment multi-chip summarization

Anna Lobley ▴ 10

@anna-lobley-2621

Last seen 11.3 years ago

I am trying to integrate several diverse Affymetrix datasets from the GEO database, all from the same organism and the same chip design (~500 samples).

I'm interested in obtaining absolute expression values and have created a robust target distribution for quantile normalisation over all samples (carried out outside R).

Because of memory requirements, I have only been able to run median polish for probe summarization within single GEO experiments, rather than across all of the quantile-normalised data in one go.

My question is fairly open-ended: I'm concerned that this approach will over-emphasize variance between experiments, which could be avoided by running median polish over the entire dataset. Is there a more appropriate way of carrying out summarization on a dataset this large?

Thanks in advance for any opinions or help.
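For readers unfamiliar with the summarization step in question, here is a minimal sketch of Tukey's median polish on a single probeset, written in plain Python rather than R. It is an illustration only, not the code used in the post: the function name `median_polish`, the toy matrix, and the iteration settings are all assumptions. The point it shows is that the probe (row) effects are estimated from whichever arrays are in the matrix, so fitting each GEO experiment separately gives each experiment its own probe effects.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Median polish of a probes x samples matrix of log2 intensities.

    Returns (overall, probe_effects, sample_effects, residuals) such that
    matrix[i][j] == overall + probe_effects[i] + sample_effects[j] + residuals[i][j].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [list(row) for row in matrix]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        change = 0.0

        # Sweep out row (probe) medians.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        # Move the median of the column effects into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]

        # Sweep out column (sample) medians.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
            change += abs(m)
        # Move the median of the row effects into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]

        if change < tol:  # nothing left to sweep out
            break

    return overall, row_eff, col_eff, resid


# Toy probeset (hypothetical values): 3 probes x 4 arrays, purely additive.
probe_offsets = [-1.0, 0.0, 1.0]
sample_levels = [5.0, 7.0, 9.0, 6.0]
probeset = [[p + s for s in sample_levels] for p in probe_offsets]

overall, probe_eff, sample_eff, residuals = median_polish(probeset)

# Per-array expression summary on the log2 scale.
expression = [overall + c for c in sample_eff]
```

The expression summary for each array is the overall term plus that array's column effect. Running the function once per experiment and once over the pooled matrix is one way to check how much the per-experiment probe effects differ from the pooled ones.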

GO probe affy • 830 views

updated 17.9 years ago by Henrik Bengtsson ★ 2.4k • written 17.9 years ago by Anna Lobley ▴ 10

Henrik Bengtsson ★ 2.4k

@henrik-bengtsson-4333

Last seen 19 months ago

United States

On Jan 30, 2008 4:04 AM, Anna Lobley <a.lobley at cs.ucl.ac.uk> wrote:
> I am trying to integrate several diverse affy datasets
> from the GEO database all from the same organism
> same chip design (~500 samples).
>
> I'm interested in obtaining absolute expression values
> and have created a robust target distribution for quantile normalisation
> over all samples (carried out externally from R).
>
> Due to cpu memory requirements I have only
> been able to run median polish for probe summarization on the dataset
> within single GEO experiments rather than across all of the quantile
> normalised data in one go.
>
> My question is fairly open ended and as follows:
> I'm concerned that this methodology will over-emphasize
> variance between experiments that could be avoided using
> median polish over the entire dataset. Is there a more
> appropriate way of carrying out summarization on this large
> dataset?

If you've got your data as CEL files with a corresponding CDF file, you could try the aroma.affymetrix package:

http://www.braju.com/R/aroma.affymetrix/

/Henrik

> thanks in advance for opinions/help

17.9 years ago Henrik Bengtsson ★ 2.4k
